PREFACE


This publication is a compilation of the papers presented at the Symposium on 
Multigrid Methods held at Ames Research Center on October 21-22, 1981. The papers 
represent an international sampling of the most recent developments in numerical 
solution of certain types of partial differential equations by rapidly converging 
sequences of operations on supporting grids that range from very fine to very coarse. 

The symposium was organized for the purpose of bringing together scientists 
having individual experience with, and a common interest in, multigrid processes. 

For the most part the common ground of these processes is an underlying matrix that 
is either precisely, or "close to," one which is positive definite, diagonally domi- 
nant, and similar to the Laplacian. Considerable progress has been made in identify- 
ing processes that have this common ground, in standardizing techniques best suited 
for optimizing their solution, and in extending these techniques to processes that 
have slight deviations from the standard. 

At present, published material that has shown the most dramatic success in pro- 
viding rapid convergence is limited to physical problems related to the incompres- 
sible Navier-Stokes equations or the irrotational forms of the Euler equations 
(potential or Cauchy-Riemann formulations). It is hoped that this publication will
provide further knowledge and information that can be applied to the solution of com- 
pressible Navier-Stokes equations. 
MULTILEVEL TECHNIQUES FOR NONELLIPTIC PROBLEMS*


Dennis C. Jespersen 

Department of Mathematics, Oregon State University 


SUMMARY 

Multigrid and multilevel methods have gained much attention lately and show 
great promise for the solution of elliptic problems. This paper sets up a framework 
for analyzing these methods with a view to extending their applicability to non- 
elliptic problems. A simple nonelliptic problem is given, and it is shown how a 
multilevel technique can be used for its solution. Emphasis is on "smoothness"
properties of eigenvectors and attention is drawn to the possibility of "condition- 
ing" the eigensystem so that eigenvectors have the desired smoothness properties. 


1. INTRODUCTION


The purpose of this paper is to investigate the applicability of multigrid and 
multilevel methods for the numerical solution of partial differential equations that 
are not of the elliptic type. First an exposition of one means of analyzing "classi- 
cal" multigrid techniques will be presented and then extended to deal with some 
simple nonelliptic problems. Throughout, the emphasis will be on analyzing and under- 
standing how the various components of a successful multilevel process fit together, 
and not on the solution of practical problems. In particular, all the model problems 
presented here will be set in one space dimension. The analysis will concentrate on 
linear problems, though some remarks will be made on nonlinear problems; a thorough 
understanding of the linear case is an essential prerequisite for tackling nonlinear 
problems. The focus of the analysis will be on eigenvalues and eigenvectors. 

A brief description of a standard multigrid procedure is as follows. We are to
approximate the solution of some partial differential equation. Discretize the
problem on a grid using (say) finite differences, giving a (large) linear system

    $A_b u_b = f_b$    (1)

(Here the subscript b stands for "basic.") Some preconditioning procedure may be
applied to this system, for example premultiplication of both sides of (1) by a
matrix C, giving the system

    $A^1 u^1 = f^1$    (2)

where $A^1 := C A_b$, $f^1 := C f_b$, and $u^1 := u_b$. On the grid $\Gamma^1$, let $u_0^1$ be some
initial guess at the solution $u^1$. Perform a few steps of some relaxation process,
say

    $u_n^1 := G(u_0^1, f^1)$    (3)

where G denotes the relaxation process. Let $r^1$, the residual, be defined by
$r^1 := f^1 - A^1 u_n^1$. Now, (1) is equivalent to

    $A^1 e^1 = r^1$    (4)

for the solution $u^1$ is then given by $u^1 = u_n^1 + e^1$. In order to solve (4),
somehow transfer the problem to a coarser grid $\Gamma^2$, solve

    $A^2 e^2 = r^2$    (5)

on the coarser grid, transfer $e^2$ back to the fine grid, and replace $u_n^1$ by $u_n^1$
plus the transferred $e^2$. These transfers can be formalized by denoting by $R^1$ and
$\tilde{R}^1$ "restriction" operators from grid functions on $\Gamma^1$ to grid functions on $\Gamma^2$,
and by $I^1$ and $\tilde{I}^1$ "interpolation" operators from grid functions on $\Gamma^2$ to grid
functions on $\Gamma^1$. The transfer operators are then used to define
$A^2 := R^1 A^1 I^1$ and $r^2 := \tilde{R}^1 r^1$; the problem on the coarser grid is then
$A^2 e^2 = r^2$, and $e^1$ is defined as $e^1 := \tilde{I}^1 e^2$ once $e^2$ has been found. In a
true multilevel process $e^2$ would be approximated by using still coarser grids,
but the basic ideas can be seen in the analysis of the two-level process. The keys
to constructing a successful process are the relaxation process G, the restriction
operators $R^1$ and $\tilde{R}^1$, and the interpolation operators $I^1$ and $\tilde{I}^1$.

*Funds for the support of this study have been allocated by the Ames Research
Center, NASA, Moffett Field, California, under Interchange No. NCA2-OR586-001.

The framework of the analysis is similar to that of Brandt (1977), Hackbusch
(1978), McCormick (1979), Wesseling (1980), and Frederickson (1975). The operator
here called "restriction" (from functions on the fine grid to functions on the coarse
grid) is called an "averaging operator" by McCormick, an "interpolation operator" by
Brandt, a "collection operator" by Frederickson, and a "restriction operator" by
Hackbusch and Wesseling. The operator here called "interpolation" is called
"interpolation" by McCormick and Brandt, and is called a "prolongation operator" by
Hackbusch and Wesseling. Previous authors have taken $R^1 = \tilde{R}^1$, $I^1 = \tilde{I}^1$, and
either $R^1 = (I^1)^T$ or $R^1 = (\mathrm{constant}) \cdot (I^1)^T$.

It should be noted that the difference between the computed solution and the 
exact solution, that is, the error, is not the focus of this work. It will always be 
assumed that an appropriate discretization of the partial differential equation has 
been derived and a discrete set of equations obtained. The job then is to solve the 
discrete set of equations very efficiently. 


SYMBOLS 


B(a,b,c)   tridiagonal matrix with b on main diagonal, a on subdiagonal, and c
           on superdiagonal

e          error (exact solution of discrete equations minus some approximation)

G          relaxation operator

$I^i$      interpolation operator from grid i + 1 to grid i

P          permutation matrix (a matrix whose entries are either 0 or 1, and such
           that each row and column has exactly one 1)

$R^i$      restriction operator from grid i to grid i + 1

r          residual (f - Au)

:=         quantity on left is defined as quantity on the right

Superscript:

i          relates to the ith grid (grid 1 = finest grid)

Subscript:

n          nth step of an iteration process on a fixed grid

2. A RELAXATION PROCESS 


In this section we will be working on a fixed grid $\Gamma$ and will therefore omit
the superscripts denoting the grid level. A basic component of a multilevel scheme
is the relaxation process. The relaxation process we will consider is the simple and
general Richardson process. Given the linear system

    $Au = f$    (6)

and an initial guess $u_0$, one step of the Richardson process with step-size h is
defined by

    $u_1 := u_0 + h(Au_0 - f)$    (7)

The common relaxation processes, such as the Jacobi, Gauss-Seidel, SOR, ADI, and
their block variants, can all be written as the Richardson process applied to
equation (6) or to a preconditioned form of equation (6). For example, with the sign
convention of equation (7), the Gauss-Seidel process is the Richardson procedure
with h = -1 applied to

    $(L + D)^{-1}Au = (L + D)^{-1}f$

where L and D are the lower triangular and diagonal parts of A, respectively.

A few elementary remarks about the algorithm given in equation (7) are in order.
If equation (7) is iterated, that is,

    $u_{n+1} := u_n + h(Au_n - f)$  for $n \ge 0$

it is a standard result that $u_n$ converges to the exact solution $u$ of equation (6)
for any initial guess $u_0$ if and only if the spectral radius of the iteration matrix
I + hA is less than 1. Suppose now we allow the step sizes to vary, that is,

    $u_{n+1} := u_n + h_n(Au_n - f)$  for $n \ge 0$    (8)

If A has a complete set of eigenvectors $\{v_m\}$, say $Av_m = \lambda_m v_m$, then the error
$e_n := u - u_n$ satisfies $e_{n+1} = (I + h_n A)e_n$ and so, if $e_0 = \sum_m a_m v_m$, then

    $e_n = \sum_m \left( \prod_{j=0}^{n-1} (1 + h_j \lambda_m) \right) a_m v_m$    (9)



From equation (9) it is evident that the error can (in theory) be reduced to
zero in a finite number of steps by choosing a sequence of step sizes such that each
h is -1 divided by an eigenvalue of A. This is usually quite impractical. We
might consider trying to choose the step sizes to be approximately -1 divided by an
eigenvalue of A. This is more practical; indeed, if $h_j \approx -1/\lambda_m$ then the component
of $e_n$ in the direction of $v_m$ has been greatly reduced. Pursuing this idea
further leads one to study the possibility of choosing a set of step sizes
$\{h_j, 0 \le j \le n - 1\}$ for some given n such that the polynomial

    $p(z) := \prod_{j=0}^{n-1} (1 + h_j z)$

has minimal modulus for z in some set in the complex plane that contains the
eigenvalues of A (the step sizes $h_j$ then play the role of negative reciprocals of
points in that set). This leads to the study of Chebyshev iterative methods, a topic
that has been explored by many authors (for recent work in this area see Manteuffel,
1977; McDonald, 1980). Instead of carrying out a full Chebyshev process, the plan
here is to use the nonstationary Richardson process to reduce the components of the
error in some directions $v_m$ and then to proceed with another idea.
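The finite-termination property implied by equation (9) can be checked numerically. The following is a minimal sketch, assuming NumPy and using the matrix B(1,-2,1) of order 5 as a test case; the helper names `tridiag` and `richardson`, the right-hand side, and the zero initial guess are illustrative choices, not part of the original study.

```python
import numpy as np

def tridiag(a, b, c, m):
    """B(a, b, c): b on the main diagonal, a on the subdiagonal, c on the superdiagonal."""
    return (np.diag(np.full(m, float(b)))
            + np.diag(np.full(m - 1, float(a)), -1)
            + np.diag(np.full(m - 1, float(c)), 1))

def richardson(A, f, u0, steps):
    """Nonstationary Richardson iteration, eq. (8): u <- u + h_n (A u - f)."""
    u = u0.copy()
    for h in steps:
        u = u + h * (A @ u - f)
    return u

M = 5
A = tridiag(1, -2, 1, M)
f = np.ones(M)                       # arbitrary right-hand side (assumption)
u_exact = np.linalg.solve(A, f)

# One step per eigenvalue, each with h = -1/lambda, so every factor
# (1 + h_j * lambda_m) in eq. (9) vanishes for some j.
eigenvalues = np.linalg.eigvalsh(A)  # A is symmetric here
steps = -1.0 / eigenvalues
u_final = richardson(A, f, np.zeros(M), steps)
error_norm = np.linalg.norm(u_exact - u_final)
```

For this small case the error after M steps is at rounding level. Steps whose h corresponds to a small eigenvalue lie outside the stability disk of the large eigenvalues, which is one way to see why the exact choice becomes impractical as M grows.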


It will be important that no error component be magnified by the Richardson
process. From equation (9), we see that this requires $|1 + h\lambda_m| \le 1$ for all
eigenvalues $\lambda_m$ of A, so that h is in the intersection of the disks in the complex
plane with centers $-1/\lambda_m$ and radii $|1/\lambda_m|$. Alternatively, given h we ask that
all eigenvalues of A lie in the disk with center -1/h and radius |1/h|. We
might call this disk the stability region for step size h (see fig. 1).



Figure 1.- Stability region for 
step-size h. 


To have a large stability region (and 
thereby have the stability region contain all 
the eigenvalues of A, if the eigenvalues of 
A are widely separated), we see that h will 
have to be small. If h is small, the eigen- 
vectors that are substantially diminished by 
one Richardson sweep with step size h are 
the eigenvectors associated with eigenvalues 
of modulus |1/h|, that is, eigenvalues of
large modulus. Eigenvectors associated with 
eigenvalues of small modulus are hardly dimin- 
ished by a Richardson step with small |h|. 
Loosely speaking, we might say that large 
eigenvalues are easy to eliminate and that 
small eigenvalues are difficult to eliminate 
(stably). The first part of the overall 
process will be the diminishing of (the eigen- 
vectors associated with) the eigenvalues of 
large modulus by one or more Richardson steps 
with appropriately chosen step sizes. 


If the original partial differential equa- 
tion is not self-adjoint, the matrix A will probably have nonreal eigenvalues. To 
annihilate (nearly) the error component in the direction of an eigenvector associated 
with a nonreal eigenvalue, the step size h would have to be complex. In the inter- 
ests of avoiding complex arithmetic, however, we note that, if A is a real matrix 
(as it is in most applications), then its eigenvalues come in complex conjugate pairs. 
This leads one to consider a variant of the Richardson process in which a step with
complex h is immediately followed by a step with the complex conjugate $\bar{h}$. The
equations are thus

    $u_{1/2} := u_0 + h(Au_0 - f)$    (10a)

    $u_1 := u_{1/2} + \bar{h}(Au_{1/2} - f)$    (10b)

Substituting equation (10a) into (10b) gives

    $u_1 = u_0 + 2(\mathrm{Re}\, h)(Au_0 - f) + |h|^2 A(Au_0 - f)$    (11)

where Re h denotes the real part of h. One notes that equation (11) can be carried
out in real arithmetic and still annihilate the error components in the directions
of the complex eigenvectors associated with $\lambda := -1/h$ and $\bar{\lambda}$. If h is a good
approximation to $-1/\lambda$ for some eigenvalue $\lambda$, then the error component in the
direction of the eigenvectors associated with $\lambda$ and $\bar{\lambda}$ will be substantially
reduced (reduced by a factor of $|1 + \lambda h| \, |1 + \lambda \bar{h}|$).
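The equivalence between the complex pair of steps (10a)-(10b) and the real form (11) can be confirmed on a small example. This is a sketch assuming NumPy; the random test matrix, the seed, and the particular value of h are arbitrary illustrative choices.

```python
import numpy as np

def complex_pair_step(A, f, u0, h):
    """Eqs. (10a)-(10b): a step with complex h followed by a step with conj(h)."""
    u_half = u0 + h * (A @ u0 - f)
    return u_half + np.conj(h) * (A @ u_half - f)

def real_pair_step(A, f, u0, h):
    """Eq. (11): the same update written entirely in real arithmetic."""
    r = A @ u0 - f
    return u0 + 2.0 * h.real * r + abs(h) ** 2 * (A @ r)

rng = np.random.default_rng(0)       # arbitrary real test data (assumption)
A = rng.standard_normal((6, 6))
f = rng.standard_normal(6)
u0 = rng.standard_normal(6)
h = complex(-0.05, 0.4)              # arbitrary complex step size (assumption)

u_complex = complex_pair_step(A, f, u0, h)
u_real = real_pair_step(A, f, u0, h)
```

The imaginary part of `u_complex` cancels up to rounding, so the real form costs two matrix-vector products per pair and needs no complex storage.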

It is again important to inquire into the stability properties of the iteration.
Given a (complex) step size h, the set of eigenvectors that are not amplified by
equation (11) is the set of eigenvectors belonging to eigenvalues $\lambda$ such that
$|1 + \lambda h| \, |1 + \lambda \bar{h}| \le 1$, that is,

    $|\lambda + 1/h| \, |\lambda + 1/\bar{h}| \le 1/|h|^2$

This defines a region in the complex plane whose boundary is called an oval of
Cassini; in the special case when Re h = 0, this reduces to a lemniscate
(two-leaved rose; fig. 2).

Again we see that to have a large sta- 
bility region, |h| will have to be small, so 
that the (eigenvectors associated with) 
eigenvalues of large modulus will be easy to 
diminish, and the (eigenvectors associated 
with) eigenvalues of small modulus will be 
difficult to diminish. Also note that for a
small |h| the Richardson step
$u_{n+1} = u_n + h(Au_n - f)$ is the explicit Euler
method applied to the time integration of
du/dt = Au - f. Time-like methods will reach
steady-state solutions, but will require a 
large amount of computation to do so. If we 
are interested in only the steady-state 
solution, we are free to use methods that are 
not time accurate. 

To conclude this section, the generality of the Richardson process should again
be emphasized. As noted above, the usual iterative methods can be written in the
form of a Richardson process. The considerations of eigenvector decomposition lead
us to use the following as a rule of thumb: The large eigenvalues are easy to
eliminate. The difficult problem is to stably and rapidly eliminate small
eigenvalues. We will attempt to do this by combining a Richardson process with a
multilevel procedure.

Figure 2.- Oval of Cassini.


3. A MODEL ELLIPTIC PROBLEM 


The goal of this paper is the study of multilevel methods as applied to nonellip- 
tic problems. Nevertheless, in this section we will look at a model one-dimensional 
elliptic problem and show how the analysis proceeds. The hope is that looking at 
this familiar problem will give confidence in the analytical techniques when they are 
used to study nonelliptic problems. Consider then the simple problem (which was also 
used as a model problem by Hackbusch, 1978): 


    $u''(x) = f(x)$,  $0 < x < 1$
    $u(0) = 0$,  $u(1) = 0$    (12)

We discretize equation (12) on a uniform grid with points $x_j := j\Delta x$,
$0 \le j \le M + 1$, where M is an odd integer and $\Delta x := 1/(M + 1)$. The standard
second-order centered-difference scheme for (12) gives the linear system $A^1 u^1 = f^1$,
where $A^1$ is the M by M tridiagonal matrix B(1,-2,1), and $f^1$ is the vector with
entries $f(x_j)(\Delta x)^2$, $1 \le j \le M$. (Recall that B(1,-2,1) denotes the tridiagonal
matrix with -2 on the main diagonal and 1's on the sub- and superdiagonals.) The
eigenvalues of $A^1$ are

    $\lambda_m := -2\{1 - \cos[m\pi/(M + 1)]\} = -4 \sin^2[m\pi/(2M + 2)]$,  $1 \le m \le M$    (13)

with corresponding eigenvectors $v_m$, where

    $(v_m)_j = \sin[jm\pi/(M + 1)]$,  $1 \le j, m \le M$    (14)


Note that the eigenvectors associated with eigenvalues of large modulus are
highly oscillatory, and the eigenvectors associated with eigenvalues of small modulus
are "smooth." Thus, given some initial guess, a few Richardson sweeps with
appropriate step sizes will substantially reduce the "high-frequency" component of the
error. This idea seems to be one of the motivating ideas for the multigrid method
for elliptic equations; high-frequency components of the error correspond to
eigenvalues of large modulus and are easy to diminish by an appropriate relaxation
technique which is thought of as "smoothing." (We will see that use of the term
"smoothing" may not be appropriate for nonelliptic problems; the term "relaxation"
will be used in this paper.)
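The link between eigenvalue modulus and oscillation can be made concrete for the model matrix. The sketch below, assuming NumPy, checks equations (13) and (14) for M = 7 and measures the "roughness" of each eigenvector as the sum of squared jumps between neighbouring grid values (boundary zeros included); the roughness measure and the variable names are illustrative choices, not taken from the paper.

```python
import numpy as np

M = 7
A = (np.diag(-2.0 * np.ones(M))
     + np.diag(np.ones(M - 1), 1)
     + np.diag(np.ones(M - 1), -1))                      # B(1,-2,1)

idx = np.arange(1, M + 1)
lam = -4.0 * np.sin(idx * np.pi / (2 * M + 2)) ** 2      # eq. (13)
V = np.sin(np.outer(idx, idx) * np.pi / (M + 1))         # column m-1 holds v_m, eq. (14)

# Largest entry of A v_m - lambda_m v_m over all m.
eig_residual = np.max(np.abs(A @ V - V * lam))

# Sum of squared jumps between neighbouring values, with u(0) = u(1) = 0 appended.
padded = np.vstack([np.zeros(M), V, np.zeros(M)])
roughness = np.sum(np.diff(padded, axis=0) ** 2, axis=0)
```

For this matrix the roughness of $v_m$ equals $|\lambda_m| \, \|v_m\|^2$ with $\|v_m\|^2 = (M + 1)/2$, so ordering the eigenvectors by eigenvalue modulus is the same as ordering them by oscillation.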

After a few relaxation steps the residual $r^1 = f^1 - A^1 G(u_0^1, f^1)$ should consist
mostly of "low frequencies" and be well representable on a coarser grid. Letting
$\Gamma^1$ denote the initial fine grid and $\Gamma^2$ denote the grid with mesh spacing $2\Delta x$, a
restriction operator from functions on the fine grid to functions on the coarse grid
is representable as a mapping from $\mathbb{R}^M$ to $\mathbb{R}^{(M-1)/2}$ (Euclidean M-space to Euclidean
(M - 1)/2-space).

For example, if we simply transfer the even-subscripted components of the
residual $r^1$ to the grid $\Gamma^2$, then $R^1$ is the (M - 1)/2 by M matrix

    $R^1 = \begin{bmatrix}
    0 & 1 & 0 & 0 & 0 & 0 & \cdots \\
    0 & 0 & 0 & 1 & 0 & 0 & \cdots \\
    0 & 0 & 0 & 0 & 0 & 1 & \cdots \\
    \vdots & & & & & & \ddots
    \end{bmatrix}$    (15)


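Another possibility is a linear averaging procedure, which would give

    $R^1 = \begin{bmatrix}
    1/4 & 1/2 & 1/4 & 0 & 0 & 0 & 0 & \cdots \\
    0 & 0 & 1/4 & 1/2 & 1/4 & 0 & 0 & \cdots \\
    0 & 0 & 0 & 0 & 1/4 & 1/2 & 1/4 & \cdots \\
    \vdots & & & & & & & \ddots
    \end{bmatrix}$    (16)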


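The interpolation operator $I^1$ from functions on the coarse grid to functions
on the fine grid is a mapping from $\mathbb{R}^{(M-1)/2}$ to $\mathbb{R}^M$. One possibility is linear
interpolation from the even-numbered grid points to the odd-numbered grid points.
In this case, the matrix representation of $I^1$ would be

    $I^1 = \begin{bmatrix}
    1/2 & 0 & 0 & \cdots \\
    1 & 0 & 0 & \cdots \\
    1/2 & 1/2 & 0 & \cdots \\
    0 & 1 & 0 & \cdots \\
    0 & 1/2 & 1/2 & \cdots \\
    0 & 0 & 1 & \cdots \\
    0 & 0 & 1/2 & \cdots \\
    \vdots & & & \ddots
    \end{bmatrix}$    (17)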


Another possibility is an implicit cubic polynomial interpolation, defined as
follows. Consider a set of data $\{(x_j, y_j): 1 \le j \le M\}$, where we assume $y_j$ is known
for j even and unknown for j odd (and $x_j := j\Delta x$). For $3 \le k \le M - 2$ and k odd,
let $p_k(t)$ be the polynomial of degree at most 3 which satisfies $p_k(x_j) = y_j$,
j = k - 2, k - 1, k + 1, k + 2. Define $y_k := p_k(x_k)$. Thus $y_k$ is linearly related
to the known values $y_{k-1}$, $y_{k+1}$ and the unknown values $y_{k-2}$, $y_{k+2}$. For k = 1,
let $p_1(t)$ be the polynomial of degree at most 3 which satisfies $p_1(0) = 0$,
$p_1(x_2) = y_2$, $p_1(x_3) = y_3$, $p_1(x_4) = y_4$; define $y_1 := p_1(x_1)$. Define $y_M$ in a
similar manner. (Note that the definition of $p_1$ would have to be modified if the
left-hand boundary condition were of the Neumann type; if the boundary condition for
the differential equation were $u'(0) = 0$ then $p_1$ would be required to satisfy
$p_1'(0) = 0$ instead of $p_1(0) = 0$.) The result is a linear system

    $B y_{odd} = C y_{even}$    (18)

where $y_{odd} := (y_1, y_3, \ldots, y_M)^T$, $y_{even} := (y_2, y_4, \ldots, y_{M-1})^T$, B is the
(M + 1)/2 by (M + 1)/2 tridiagonal matrix:

    $B = \begin{bmatrix}
    1 & 1 & & & \\
    1 & 6 & 1 & & \\
    & \ddots & \ddots & \ddots & \\
    & & 1 & 6 & 1 \\
    & & & 1 & 1
    \end{bmatrix}$    (19)

and C is the (M + 1)/2 by (M - 1)/2 tridiagonal matrix:

    $C = \begin{bmatrix}
    3/2 & 1/4 & & \\
    4 & 4 & & \\
    & \ddots & \ddots & \\
    & & 4 & 4 \\
    & & 1/4 & 3/2
    \end{bmatrix}$    (20)


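For the cubic interpolation, one would then define

    $I^1 := P^T \begin{bmatrix} I \\ B^{-1}C \end{bmatrix}$

where P is the "even-odd" permutation matrix, which satisfies

    $P(v_1, v_2, \ldots, v_M)^T = (v_2, v_4, \ldots, v_{M-1}, v_1, v_3, \ldots, v_M)^T$

and I is the identity matrix of order (M - 1)/2.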
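The entries 1, 6, 1 and 4, 4 of the interior rows, and 3/2, 1/4 of the boundary row, follow from the Lagrange form of the cubic interpolant. The sketch below, assuming NumPy, computes the interpolation weights with grid positions measured in units of $\Delta x$; the helper name `lagrange_weights` is an illustrative choice.

```python
import numpy as np

def lagrange_weights(nodes, t):
    """Weights w_i such that p(t) = sum_i w_i * y_i for the polynomial
    interpolating the values y_i at the given nodes."""
    nodes = np.asarray(nodes, dtype=float)
    weights = np.ones(len(nodes))
    for i, x_i in enumerate(nodes):
        for k, x_k in enumerate(nodes):
            if k != i:
                weights[i] *= (t - x_k) / (x_i - x_k)
    return weights

# Interior odd point k: data at k-2, k-1, k+1, k+2, evaluated at k.
interior = lagrange_weights([-2.0, -1.0, 1.0, 2.0], 0.0)

# Left boundary, k = 1: data at x = 0 (value 0), x_2, x_3, x_4, evaluated at x_1.
boundary = lagrange_weights([0.0, 2.0, 3.0, 4.0], 1.0)
```

The interior weights are (-1/6, 2/3, 2/3, -1/6); multiplying $y_k = \sum w_i y_i$ by 6 and moving the unknown odd values to the left gives $y_{k-2} + 6y_k + y_{k+2} = 4y_{k-1} + 4y_{k+1}$. At the boundary the weight on the zero value at x = 0 drops out, leaving $y_1 + y_3 = (3/2)y_2 + (1/4)y_4$.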
In either case, the coarse-grid problem is defined to be $R^1 A^1 I^1 e^2 = R^1 r^1$, and
then $e^1$ is approximated by $I^1 e^2$. One can calculate that with $A^1$ as above, $R^1$
given by equation (15), and $I^1$ given by (17), the matrix $R^1 A^1 I^1$ is the
(M - 1)/2 by (M - 1)/2 matrix:

    $R^1 A^1 I^1 = -\frac{1}{2} B(-1,2,-1)$    (21)

This is (up to a scaling factor) the difference operator that would have resulted
from discretizing the original differential operator on the coarser grid $\Gamma^2$.
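Equation (21) is easy to confirm for a particular M. The following sketch, assuming NumPy, builds the injection operator of (15) and the linear interpolation of (17) for M = 7 and forms the coarse-grid matrix; the names `R_inj`, `I_lin`, and `tridiag` are illustrative.

```python
import numpy as np

def tridiag(a, b, c, m):
    """B(a, b, c): b on the main diagonal, a on the subdiagonal, c on the superdiagonal."""
    return (np.diag(np.full(m, float(b)))
            + np.diag(np.full(m - 1, float(a)), -1)
            + np.diag(np.full(m - 1, float(c)), 1))

M = 7
n_coarse = (M - 1) // 2
A1 = tridiag(1, -2, 1, M)

R_inj = np.zeros((n_coarse, M))      # eq. (15): pick even-numbered points
I_lin = np.zeros((M, n_coarse))      # eq. (17): linear interpolation
for i in range(n_coarse):
    R_inj[i, 2 * i + 1] = 1.0                    # 0-based index of grid point 2(i+1)
    I_lin[2 * i:2 * i + 3, i] = [0.5, 1.0, 0.5]

coarse = R_inj @ A1 @ I_lin
expected = -0.5 * tridiag(-1, 2, -1, n_coarse)   # right-hand side of eq. (21)

# Half the transpose of the interpolation gives the averaging rows of eq. (16).
R_avg = 0.5 * I_lin.T
```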


One complete step of the two-level process is defined by $u_1^1 := I^1 e^2 + G^1 u_0^1$.
This can be written as a matrix iterative process as follows. Let P be the
even-odd permutation matrix defined above. The problem

    $A^1 e^1 = r^1$

is equivalent to the problem

    $(P A^1 P^T)(P e^1) = P r^1$    (22)

which is, in block form,

    $\begin{bmatrix} A_1^1 & A_2^1 \\ A_3^1 & A_4^1 \end{bmatrix}
     \begin{bmatrix} e_a^1 \\ e_b^1 \end{bmatrix}
     = \begin{bmatrix} r_a^1 \\ r_b^1 \end{bmatrix}$    (23)

where subscript a refers to the even unknowns and the subscript b refers to the
odd unknowns.

The first block equation of equation (23) reads

    $A_1^1 e_a^1 + A_2^1 e_b^1 = r_a^1$    (24)

If we make the approximation $e_b^1 = I_{a,b}^1 e_a^1$ for some operator $I_{a,b}^1$, then (24)
becomes

    $(A_1^1 + A_2^1 I_{a,b}^1) e_a^1 = r_a^1$    (25)

which is equivalent to $(R^1 A^1 I^1)e^2 = R^1 r^1$ in case the operator $R^1$ is given by (15);
the matrix $I_{a,b}^1$ can be given by (for example)

    $I_{a,b}^1 = \begin{bmatrix}
    1/2 & & & \\
    1/2 & 1/2 & & \\
    & \ddots & \ddots & \\
    & & 1/2 & 1/2 \\
    & & & 1/2
    \end{bmatrix}$    (26)

which corresponds to linear interpolation as in (17), or by

    $I_{a,b}^1 = B^{-1}C$    (27)

where B and C are the matrices from the implicit cubic interpolation process
defined above.


After solving (25), we have

    $e^1 \approx P^T \begin{bmatrix} e_a^1 \\ I_{a,b}^1 e_a^1 \end{bmatrix}
        = P^T \begin{bmatrix}
          (A_1^1 + A_2^1 I_{a,b}^1)^{-1} & 0 \\
          I_{a,b}^1 (A_1^1 + A_2^1 I_{a,b}^1)^{-1} & 0
          \end{bmatrix} P r^1
        =: P^T S P r^1$    (28)

and thus

    $u_1^1 = P^T S P (f^1 - A^1 G^1 u_0^1) + G^1 u_0^1
           = (I - P^T S P A^1) G^1 u_0^1 + P^T S P f^1$    (29)

One step of the iteration process is thus one step of a stationary iterative
process with iteration matrix

    $T := (I - P^T S P A^1) G^1$    (30)

The rate of convergence of the process is controlled by the spectral radius of the
matrix T.

The operators $R^1$ and $I^1$ can be identified from (28) as

    $R^1 = [\, I \mid 0 \,] P$    (31)

and

    $I^1 = P^T \begin{bmatrix} I \\ I_{a,b}^1 \end{bmatrix}$    (32)

Other authors (McCormick, 1977; Wesseling, 1980) have suggested that the condition
$I^1 = (R^1)^T$ or $I^1 = (\mathrm{constant}) \cdot (R^1)^T$ be enforced; an obvious way to do this
is to replace (31) by

    $R^1 = [\, I \mid (I_{a,b}^1)^T \,] P$    (33)


Numerical tests with both possibilities are reported below. 
Some numerical experiments were carried out in which the eigenvalues of the
overall iteration matrix T were computed for different numbers of grid points,
different interpolation and restriction operators, and varying relaxation procedures.

The relaxation procedure was of the form

    $G^1 = \prod_{j=1}^{k} (I + h_j A^1)$

(i.e., the relaxation procedure consisted of k Richardson sweeps), where
$k \in \{0,1,2,3\}$ and $h_j \in \{1/2, 1/3, 1/4\}$. (Note that the stability condition of
section 2, $|1 + \lambda h| \le 1$, becomes in this case
$0 \le h \le 2/\{4 \sin^2[m\pi/(2M + 2)]\}$, $1 \le m \le M$, so 1/2 is effectively an upper
bound for h.) Results were computed for the cases $M \in \{7, 11\}$, with $I^1$ given by
(32) and (26) ("linear interpolation") or by (32) and (27) ("cubic interpolation"),
and $R^1$ given by (15) or by (33). A selection of results from these computations
is given in table 1.
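An experiment of this kind can be sketched in a few lines. The code below, assuming NumPy, forms the two-level iteration matrix in the equivalent form $T = (I - I^1 (R^1 A^1 I^1)^{-1} R^1 A^1) G^1$ with $R^1$ from (15) and linear interpolation (17), and computes its spectral radius. It is an illustrative reimplementation, not the program used for the paper; the function names are hypothetical.

```python
import numpy as np

def model_matrix(M):
    """A^1 = B(1,-2,1) of order M."""
    return (np.diag(-2.0 * np.ones(M))
            + np.diag(np.ones(M - 1), 1)
            + np.diag(np.ones(M - 1), -1))

def injection(M):
    """R^1 of eq. (15): keep the even-numbered fine-grid values."""
    R = np.zeros(((M - 1) // 2, M))
    for i in range((M - 1) // 2):
        R[i, 2 * i + 1] = 1.0
    return R

def linear_interpolation(M):
    """I^1 of eq. (17): linear interpolation from even to odd points."""
    I1 = np.zeros((M, (M - 1) // 2))
    for i in range((M - 1) // 2):
        I1[2 * i:2 * i + 3, i] = [0.5, 1.0, 0.5]
    return I1

def two_level_matrix(M, steps):
    """Iteration matrix T for k = len(steps) Richardson sweeps, then coarse correction."""
    A = model_matrix(M)
    R = injection(M)
    I1 = linear_interpolation(M)
    G = np.eye(M)
    for h in steps:
        G = (np.eye(M) + h * A) @ G
    coarse = R @ A @ I1
    correction = I1 @ np.linalg.solve(coarse, R @ A)
    return (np.eye(M) - correction) @ G

def spectral_radius(T):
    return float(np.max(np.abs(np.linalg.eigvals(T))))

rho_no_relaxation = spectral_radius(two_level_matrix(7, []))
rho_quarter = spectral_radius(two_level_matrix(7, [0.25]))
rho_two_sweeps = spectral_radius(two_level_matrix(7, [0.5, 0.25]))
T_half = two_level_matrix(7, [0.5])
nilpotent_defect = np.max(np.abs(T_half @ T_half))
```

With one sweep at h = 1/2 the matrix T is nilpotent rather than zero, so its square vanishes while T itself does not; computing eigenvalues of such a defective matrix directly is numerically delicate, which is why the sketch checks the square instead.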

From table 1 we see that the total process will not converge if k = 0, that is,
if no relaxation sweeps are used. (This is also clear from the definition of T and
from the fact that S is a rank-deficient matrix; see eq. (30).) With linear
interpolation, the results with $R^1$ given by (15) are identical with the results
when $R^1$ is given by (33), but we have no formal proof of this. From the output,
there seems little point in using more than one relaxation sweep with linear
interpolation. In two of the cases presented, the spectral radius of T is 0; this
does not imply that the iterative process converges in one step, because, for these
cases, it turns out that T has nontrivial Jordan blocks in its Jordan canonical
form, that is, n by n blocks of the form

    $\begin{bmatrix}
    0 & 1 & & \\
    & 0 & \ddots & \\
    & & \ddots & 1 \\
    & & & 0
    \end{bmatrix}$

with n > 1. In these cases, the total process would converge in n steps.

With cubic interpolation it appears there can be a substantial gain by performing
more than one relaxation sweep on the fine level. Using the $R^1$ from (33) is
not as favorable as using $R^1$ from (15). A full investigation of this has yet to be
undertaken.

Although not shown in table 1, computations with higher values of M revealed
that the spectral radius of T was virtually independent of the number of grid
points. This is an encouraging sign, and is typical of the classical multigrid
setting (Brandt, 1977; Hackbusch, 1978). Also, the idea of eigensystem
mixing (Lomax, 1981, unpublished notes) may serve to further decrease the spectral
radius of T, leading to even faster convergence of the overall process. Finally, a
similar analysis can be carried out in the case when the differential equation has a
boundary condition of the Neumann type; appropriate modifications must be made in the
definition of the interpolation operator at a Neumann boundary.
TABLE 1.- SPECTRAL RADIUS OF TOTAL PROCESS:
MODEL ELLIPTIC PROBLEM

Linear interpolation
(The results with $R^1$ given by (33) are identical with those when $R^1$ is
given by (15).)

  M    k    h                 Spectral radius of T
  7    0    -                 1.0
       1    1/2               0
       1    1/4               .5
       2    1/2, 1/2          .5
       2    1/4, 1/4          .46
       2    1/2, 1/4          .43
       3    1/4, 1/4, 1/4     .45
       3    1/2, 1/2, 1/2     0
       3    1/2, 1/3, 1/4     .43
 11    0    -                 1.0
       1    1/4               .5
       1    1/2               0
       3    1/2, 1/3, 1/4     .47
       3    1/4, 1/4, 1/4     .47

(In the cases when T has spectral radius 0, T has nonlinear elementary divisors,
i.e., the Jordan canonical form of T is not diagonal.)

Cubic interpolation

                              Spectral radius of T
  M    k    h                 $R^1$ from (15)    $R^1$ from (33)
  7    0    -                 1.0                1.0
       1    1/2               1.0                .91
       1    1/4               .5                 .5
       2    1/4, 1/4          .25                .46
       2    1/2, 1/4          .25                .44
       3    1/4, 1/4, 1/4     .13                .44
       3    1/2, 1/3, 1/4     .08                .42
 11    0    -                 1.0                1.0
       1    1/2               1.0                .96
       1    1/4               .5                 .5
       2    1/4, 1/4          .25                .48
       2    1/2, 1/4          .25                .47
       3    1/4, 1/4, 1/4     .13                .47
       3    1/2, 1/3, 1/4     .08                .46
4. A MODEL NONELLIPTIC PROBLEM


The preceding section introduced some ideas and set up a framework for analysis 
of a multilevel method for a model elliptic problem. In this section, we will begin 
to investigate the applicability of multilevel methods to nonelliptic problems. The 
model problem to be used is the steady-state version of $u_t + u_x = f$:

    $u_x = f$,  $0 < x < 1$
    $u(0) = g_0$    (34)

Introduce grid points $x_j := j\Delta x = j/(M + 1)$, $0 \le j \le M + 1$, M odd, and consider
second-order centered differences (leave aside for now the fact that centered
differences are probably inappropriate for this particular problem; we have in mind
systems of equations describing subsonic flow for which centered differencing may be
very appropriate). The finite-difference equations are

    $u_{j+1} - u_{j-1} = 2\Delta x f(x_j)$,  $1 \le j \le M - 1$    (35)

(where $u_0 := g_0$). At the right-hand boundary let us use a linear extrapolation
$u_{M+1} = 2u_M - u_{M-1}$ and derive from $(u_{M+1} - u_{M-1})/2\Delta x = f(x_M)$ the equation

    $u_M - u_{M-1} = \Delta x f(x_M)$    (36)

We get then the linear system $A_b u_b = f_b$, where

    $A_b = \begin{bmatrix}
    0 & 1 & & & \\
    -1 & 0 & 1 & & \\
    & \ddots & \ddots & \ddots & \\
    & & -1 & 0 & 1 \\
    & & & -1 & 1
    \end{bmatrix}$    (37)

and $f_b := 2\Delta x[f(x_1), \ldots, f(x_{M-1}), f(x_M)/2]^T$.


Most of the eigenvalues of $A_b$ are complex; it is easy to show that all the
eigenvalues lie in the half-plane Re(z) > 0. Computations with M = 15 revealed
that, for this case, the eigenvalues lie on a curve from about 0.0026 ± 1.96i to
approximately 0.195. One possible way of treating the linear system $A_b u_b = f_b$ is
to premultiply by $(-A_b^T)$, giving the system $(-A_b^T A_b)u_b = -A_b^T f_b$, with a coefficient
matrix that is symmetric and negative definite, thus analogous to the matrices that
arise when discretizing elliptic equations (see Lomax et al., 1981). A possible
problem with this idea is that the condition number of the matrix of the new linear
system is the square of the condition number of the original matrix, which may lead
to slow convergence of an iterative technique. In this paper, we wish to stay in the
spirit of nonelliptic problems, however, and so we do not consider this possibility
further.
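The spectral statements above can be explored numerically. This sketch, assuming NumPy, builds the matrix of equation (37) for M = 15, computes its eigenvalues, and forms the symmetrized matrix $-A_b^T A_b$; the function name `nonelliptic_matrix` is illustrative, and no particular eigenvalue digits are asserted here.

```python
import numpy as np

def nonelliptic_matrix(M):
    """A_b of eq. (37): centered differences with the extrapolated last row."""
    A = np.diag(np.ones(M - 1), 1) - np.diag(np.ones(M - 1), -1)
    A[M - 1, M - 1] = 1.0
    return A

A_b = nonelliptic_matrix(15)
eigs = np.linalg.eigvals(A_b)
smallest = eigs[np.argmin(np.abs(eigs))]   # eigenvalue of smallest modulus
largest = eigs[np.argmax(np.abs(eigs))]    # eigenvalue of largest modulus

# Symmetrized system matrix and the condition-number relation noted in the text.
normal = -A_b.T @ A_b
normal_eigs = np.linalg.eigvalsh(normal)
cond_ratio = np.linalg.cond(normal) / np.linalg.cond(A_b) ** 2
```

Printing `smallest` and `largest` allows a direct comparison with the values quoted for M = 15. The ratio `cond_ratio` is 1 up to rounding, which is the squaring of the condition number referred to above.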

Another possibility for treating the linear system is to use some relaxation
method, such as the complex Richardson technique of section 2, to approximately
eliminate the components of the error along the directions of the eigenvectors
associated with eigenvalues of large modulus, and then to transfer the error equation
to a coarser grid. For this idea to succeed, it is presumably the case that the
eigenvectors associated with eigenvalues of small modulus should not fluctuate rapidly
on the fine grid, so that they are well representable on the coarse grid. Such is not
the case for the matrix $A_b$; indeed, $A_b$ can be viewed as a perturbation of the
matrix B(-1,0,1), which has eigenvalues

    $\lambda_k = 2i \cos[\pi k/(M + 1)]$,  $1 \le k \le M$

and eigenvectors $x^{(k)}$ with

    $x_j^{(k)} = i^j \sin[j\pi k/(M + 1)]$,  $1 \le j, k \le M$

all of which oscillate rapidly on the fine grid. Some of the eigenvectors of $A_b$
are shown in figure 3; in the figure, the horizontal axis is j, and the vertical
axis is $v_j$ (where $A_b v = \lambda v$, $v = (v_1, \ldots, v_M)^T$).
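The eigenpairs of the unperturbed matrix B(-1,0,1) can be checked directly. The sketch below, assuming NumPy, uses eigenvalues $2i\cos[\pi k/(M+1)]$ and eigenvector entries $i^j \sin[j\pi k/(M+1)]$; this pairing was verified algebraically for this sketch and the variable names are illustrative.

```python
import numpy as np

M = 15
S = np.diag(np.ones(M - 1), 1) - np.diag(np.ones(M - 1), -1)   # B(-1,0,1)

row = np.arange(1, M + 1)        # grid index j
mode = np.arange(1, M + 1)       # eigenvalue index k
lam = 2j * np.cos(np.pi * mode / (M + 1))
X = (1j ** row)[:, None] * np.sin(np.outer(row, mode) * np.pi / (M + 1))

# Largest entry of S x - lambda x over all eigenpairs.
skew_residual = np.max(np.abs(S @ X - X * lam))
```

The factor $i^j$ cycles through 1, i, -1, -i, so every eigenvector changes character from one grid point to the next regardless of how small its eigenvalue is.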


[Figure 3 panels: Eigenvector 1, $\lambda = 0.195$; Eigenvector 2 (real and imaginary
parts), $\lambda = 0.152 + 0.332i$; Eigenvector 15 (real and imaginary parts),
$\lambda = 0.00264 + 1.96i$.]

Figure 3.- Eigenvectors of $A_b$ (see (37)); M = 15.

Although the eigenvectors oscillate, there is evidently a regularity about them
which we can try to exploit. Let us try to find a simple transformation that will
change the eigenvectors associated with small eigenvalues from rapidly oscillating
to slowly oscillating. First, let $P_2$ be the "even-odd reversed" permutation which
is defined by

    $P_2(v_1, v_2, \ldots, v_M)^T = (v_2, v_4, \ldots, v_{M-1}, v_M, v_{M-2}, \ldots, v_3, v_1)^T$

The result of applying $P_2$ to the eigenvectors of figure 3 is shown in figure 4; the
eigenvectors associated with eigenvalues of small modulus have become much smoother,
though there is still a pronounced kink. To remove this kink, let F be the matrix
that satisfies

    $(Fv)_j = v_j$  if $j \le (M + 1)/2$,
    $(Fv)_j = 2v_{(M+1)/2} - v_j$  if $j > (M + 1)/2$    (38)

For example, if M = 7, then F is the matrix

    $F = \begin{bmatrix}
    1 & 0 & 0 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 0 & 2 & -1 & 0 & 0 \\
    0 & 0 & 0 & 2 & 0 & -1 & 0 \\
    0 & 0 & 0 & 2 & 0 & 0 & -1
    \end{bmatrix}$

(Note that $F^{-1} = F$.) The result of applying F to the vectors of figure 4 is
shown in figure 5. The eigenvectors associated with eigenvalues of small modulus are
now slowly varying, and the eigenvectors associated with eigenvalues of large modulus
(which we do not care about anyway, since their contribution to the error is destined
to be removed by a few Richardson sweeps) are still rapidly varying. We might call F
a "reflection" matrix.


[Figure 4 panels: Eigenvector 1; Eigenvector 2 (real part); Eigenvector 15 (real
part).]

Figure 4.- Eigenvectors of $A_b$ after permutation.

Now, the basic linear system $A_b u_b = f_b$ is equivalent to

    $(F P_2 A_b P_2^T F)(F P_2 u_b) = F P_2 f_b$    (39)

or

    $A^1 u^1 = f^1$    (40)

where $A^1 := F P_2 A_b P_2^{-1} F^{-1} = F P_2 A_b P_2^T F$ (since $F^{-1} = F$), with
$u^1 := F P_2 u_b$ and $f^1 := F P_2 f_b$. For the case M = 7, it turns out
that $A^1$ is given by

    $A^1 = \begin{bmatrix}
    0 & 0 & 0 & 0 & 0 & -1 & 1 \\
    0 & 0 & 0 & 0 & -1 & 1 & 0 \\
    0 & 0 & 0 & -1 & 1 & 0 & 0 \\
    0 & 0 & -1 & 1 & 0 & 0 & 0 \\
    0 & 1 & -3 & 2 & 0 & 0 & 0 \\
    1 & -1 & -2 & 2 & 0 & 0 & 0 \\
    -1 & 0 & -2 & 2 & 0 & 0 & 0
    \end{bmatrix}$    (41)

The matrix $A^1$ has the same eigenvalues as the matrix $A_b$, and the eigenvectors of
$A^1$ are the eigenvectors of $A_b$ premultiplied by $F P_2$; hence the eigenvectors of $A^1$
associated with eigenvalues of small modulus are slowly varying and are thus good
candidates for accurate transfer to a coarser grid. This whole process might be
described as a "conditioning" of the eigensystem of $A_b$, where the goal of the
conditioning is to transform the eigenvectors associated with eigenvalues of small
modulus to vectors that are smooth when considered as grid functions.
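The conditioning transformation can be reproduced for M = 7. The sketch below, assuming NumPy, takes the "even-odd reversed" permutation to list the even-numbered components in increasing order followed by the odd-numbered components in decreasing order (an ordering inferred here from the entries of the M = 7 matrix in (41)), builds the reflection matrix of (38), and forms $F P_2 A_b P_2^T F$. Function names are illustrative.

```python
import numpy as np

def nonelliptic_matrix(M):
    """A_b of eq. (37)."""
    A = np.diag(np.ones(M - 1), 1) - np.diag(np.ones(M - 1), -1)
    A[M - 1, M - 1] = 1.0
    return A

def even_odd_reversed(M):
    """P_2: (v_1..v_M) -> (v_2, v_4, ..., v_{M-1}, v_M, v_{M-2}, ..., v_1)."""
    order = list(range(2, M, 2)) + list(range(M, 0, -2))   # 1-based indices
    P2 = np.zeros((M, M))
    for row, col in enumerate(order):
        P2[row, col - 1] = 1.0
    return P2

def reflection(M):
    """F of eq. (38): reflect the second half about the middle component."""
    mid = (M + 1) // 2                                     # 1-based middle index
    F = np.eye(M)
    for j in range(mid + 1, M + 1):
        F[j - 1, j - 1] = -1.0
        F[j - 1, mid - 1] = 2.0
    return F

M = 7
A_b = nonelliptic_matrix(M)
P2 = even_odd_reversed(M)
F = reflection(M)
A1 = F @ P2 @ A_b @ P2.T @ F

eigs_b = np.linalg.eigvals(A_b)
eigs_1 = np.linalg.eigvals(A1)
```

Because the transformation is a similarity, `eigs_1` and `eigs_b` agree as sets, and applying `F @ P2` to an eigenvector of `A_b` gives the corresponding eigenvector of `A1`.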
[Figure 5 panels: Eigenvector 1; Eigenvector 2 (real part); Eigenvector 15 (real
part).]

Figure 5.- Eigenvectors of $A_b$ after permutation and reflection.


Some numerical experiments were performed along the lines of those in
section 3. Beginning with the matrix $\tilde{A}_h$ of (40), the matrix

$$T := (I - I_H^h A_H^{-1} R_h^H \tilde{A}_h)\,G$$

was computed for $M \in \{7, 11\}$, linear and cubic interpolation and
restriction operators, and varying relaxation procedures. The relaxation
procedure (note that "smoothing" no longer seems appropriate) was taken to be
of the form

$$G = \prod_{j=1}^{k} \left[ I + 2(\mathrm{Re}\,h_j)\,\tilde{A}_h + |h_j|^2 \tilde{A}_h^2 \right]$$

(i.e., the complex Richardson technique was used), with $k \in \{0, 1, 2, 3\}$.
The interpolation and restriction processes at the "boundaries" treated the
left-hand boundary as a Dirichlet boundary and the right-hand boundary as a
Neumann boundary. This was done because the transformed eigenvectors have
nearly zero slope at their right-hand ends and can be thought of as vanishing
at their left-hand ends. Some of the results are shown in table 2.
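
Each factor of the relaxation operator is the product $(I + hA)(I + \bar{h}A)$ of a complex step and its conjugate, which is why only $\mathrm{Re}\,h$ and $|h|^2$ need to be tabulated and all arithmetic stays real. The following sketch illustrates this; the function names are hypothetical, and the matrix `A` is a stand-in for whichever matrix is being relaxed.

```python
import numpy as np

def complex_step(re_h, abs_h_sq):
    """Recover a complex step h from the pair (Re h, |h|^2)."""
    im_h = np.sqrt(abs_h_sq - re_h ** 2)
    return complex(re_h, im_h)

def richardson_operator(A, steps):
    """Product over the steps h_j of [I + 2 Re(h_j) A + |h_j|^2 A^2],
    evaluated in real arithmetic. An empty list gives the identity (k = 0)."""
    n = A.shape[0]
    identity = np.eye(n)
    A_squared = A @ A
    G = identity.copy()
    for h in steps:
        G = G @ (identity + 2.0 * h.real * A + abs(h) ** 2 * A_squared)
    return G
```

For example, `richardson_operator(A, [complex_step(-0.05, 0.25)])` corresponds to one sweep with $\mathrm{Re}\,h = -0.05$ and $|h|^2 = 0.25$, one of the parameter pairs listed in table 2.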

From table 2 we see that when no relaxation sweeps are performed on the fine
level, the process will not converge. In all other cases (when using the
$R_h^H$ from (15)), a spectral radius of 0.54 or less was achievable. There
seems to be no particular advantage in forcing the restriction operator to
satisfy $R_h^H = c\,(I_H^h)^T$. The three $h$ values picked for the case $k = 3$
turned out to be not very good choices, as the spectral radius of the overall
process was greater than for $k = 2$ and a different choice of $h$'s. A better
choice of $h$'s would have made the spectral radius of the overall process for
$k = 3$ less than that for $k = 2$. There seems to be little advantage in using
cubic interpolation; the spectral radius of the overall process does not seem
to be significantly reduced by it.


5. EXTENSIONS 


In this section, we begin extending the ideas of the previous sections to actu- 
ally solving some (simple) problems. The first problem to look at is the simple one- 
dimensional linear variable-coefficient problem: 
TABLE 2.- SPECTRAL RADIUS OF TOTAL PROCESS: MODEL NONELLIPTIC PROBLEM

                                                          Spectral radius of T
  M   k   Re h                       |h|^2                R_h^H from (15)   R_h^H from (33)

Linear interpolation
  7   0   -                          -                    1                 1
      1   0                          0.25                 0.50              0.53
      1   -0.05                      0.25                 0.50              0.49
      2   0, -0.05                   0.25, 0.30           0.25              0.26
      2   -0.025, -0.075             0.25, 0.50           0.31              0.47
      3   -0.0017, -0.0088, -0.031   0.268, 0.342, 0.532  0.40              0.57
 11   0   -                          -                    1                 1
      1   0                          0.25                 0.50              0.51
      1   -0.05                      0.25                 0.54              0.51
      2   0, -0.05                   0.25, 0.30           0.24              0.26
      2   -0.025, -0.075             0.25, 0.50           0.24              0.48
      3   -0.0017, -0.0088, -0.031   0.268, 0.342, 0.532  0.40              0.57

Cubic interpolation
  7   0   -                          -                    1                 1
      1   0                          0.25                 0.48              0.43
      1   -0.05                      0.25                 0.54              0.57
      2   0, -0.05                   0.25, 0.30           0.26              0.34
      2   -0.025, -0.075             0.25, 0.50           0.42              0.70
      3   -0.0017, -0.0088, -0.031   0.268, 0.342, 0.532  0.35              0.83
 11   0   -                          -                    1                 1
      1   0                          0.25                 0.50              0.46
      1   -0.05                      0.25                 0.48              0.53
      2   0, -0.05                   0.25, 0.30           0.24              0.32
      2   -0.025, -0.075             0.25, 0.50           0.27              0.59
      3   -0.0017, -0.0088, -0.031   0.268, 0.342, 0.532  0.25              0.63


$$[c(x)u]_x = f(x), \quad 0 < x < 1; \qquad u(0) = g_0 \qquad (42)$$

where we assume $c(x) \ge c_0 > 0$. This is the steady-state version of the
advection equation $u_t + [c(x)u]_x = f(x)$. Suppose (42) is discretized with
second-order centered differences. On the right-hand boundary, we again use
linear extrapolation, as in (36). The system of equations that arises is
$A_h u_h = f_h$, where

$$A_h := \begin{pmatrix}
0 & c_2 & & & & \\
-c_1 & 0 & c_3 & & & \\
 & -c_2 & 0 & c_4 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & -c_{M-2} & 0 & c_M \\
 & & & & -c_{M-1} & c_M
\end{pmatrix} \qquad (43)$$
Here $c_j := c(j\Delta x)$ and

$$f_h := 2\Delta x\,[f(x_1), \ldots, f(x_{M-1}), f(x_M)/2]^T + [c(0)g_0, 0, \ldots, 0]^T .$$

In order for the investigations of the previous section to apply, it is probably
necessary that whatever matrix we work with be "close" to the matrix $A_h$ of
section 4. In an attempt to satisfy this condition, premultiply the system
$A_h u_h = f_h$ by $C := \mathrm{Diag}(1/c_1, \ldots, 1/c_M)$ (where Diag denotes a
diagonal matrix), giving a new equivalent system:

$$C A_h u_h = C f_h \qquad (44)$$

Now the process to follow is the one outlined in previous sections: replace
equation (44) by the equivalent system

$$(F P_2 C A_h P_2^{-1} F^{-1})(F P_2 u_h) = F P_2 C f_h .$$

Now let $u_h^0$ be an initial guess; perform one or more relaxation sweeps starting
with $u_h^0$; form the residual; transfer the error equation to a coarser level; solve
the error equation on the coarser level, either exactly or via further multilevel 
cycles; transfer the error on the coarser level back to the finer level; update the 
guess on the finer level; and repeat the whole process until some convergence cri- 
terion is satisfied. 
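
The sequence of steps above can be sketched as a single two-level cycle. The example below is a generic illustration, not the program described in the text: it uses a standard 1-D Poisson model problem, damped Jacobi relaxation, linear interpolation, and a coarse matrix formed as restriction times fine matrix times interpolation, all of which are assumptions made here so that the sketch is self-contained. The interpolation (26) and the restrictions (15) and (33) are not reproduced in this section.

```python
import numpy as np

def poisson_matrix(n):
    """Standard second-difference matrix on n interior points of (0, 1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h ** 2

def linear_interpolation(nc):
    """Linear interpolation from nc coarse points to 2*nc + 1 fine points."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def two_level_cycle(A, b, x, P, R, sweeps=1, omega=2.0 / 3.0):
    """One cycle: relax, form the residual, solve the coarse error
    equation exactly, interpolate the error, and update the guess."""
    diagonal = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / diagonal   # relaxation sweep
    residual = b - A @ x
    A_coarse = R @ A @ P                         # coarse-level matrix
    error_coarse = np.linalg.solve(A_coarse, R @ residual)
    return x + P @ error_coarse                  # fine-level update
```

Repeating `x = two_level_cycle(A, b, x, P, R)` until a convergence criterion is met gives the full iteration; replacing the exact coarse solve by a recursive call gives a multilevel cycle.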


This process was programmed and some results will be presented below. Two
features of the whole process are important to note. First, in order to carry out
the relaxation sweeps on the coarser level one must be able to form the matrix-vector
product $\tilde{A}_h \tilde{u}_h$. Instead of explicitly forming the matrix
$\tilde{A}_h$ and then computing $\tilde{A}_h \tilde{u}_h$, what one can do is form
the matrix-vector product $F P_2 C A_h P_2^{-1} F \tilde{u}_h$, where each
matrix-vector product (starting from the right) is easily carried out. For the
relaxation sweeps the complex Richardson technique can be used; since the matrix
$C A_h$ is in some sense close to the matrix $A_h$ of section 4, the step sizes used
can be based on our knowledge of the eigenvalues of that matrix.
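
A matrix-free evaluation of this product might look like the sketch below, which applies the factors from right to left without storing any matrix. As before, the specific ordering used for $P_2$ and the specific form of $F$ are assumptions (the ones that reproduce (41) when $c = 1$ and $M = 7$), and the function name is hypothetical.

```python
import numpy as np

def apply_transformed(c, v):
    """Return F P_2 C A_h P_2^{-1} F v without forming any matrix.

    c[j] holds c((j + 1) * dx); v is given in the permuted and reflected
    ordering. ASSUMED: even points first (ascending), then odd points
    (descending), with reflection about the turning-point entry; M odd."""
    M = len(v)
    t = (M + 1) // 2 - 1
    order = np.array(list(range(2, M + 1, 2)) + list(range(M, 0, -2))) - 1

    w = np.array(v, dtype=float)
    w[t + 1:] = 2.0 * w[t] - w[t + 1:]      # F (its own inverse)

    u = np.empty(M)
    u[order] = w                            # P_2^{-1}: natural ordering

    cu = c * u                              # A_h: centered differences
    Au = np.empty(M)
    Au[0] = cu[1]
    Au[1:-1] = cu[2:] - cu[:-2]
    Au[-1] = cu[-1] - cu[-2]

    y = Au / c                              # C = Diag(1/c_1, ..., 1/c_M)

    z = y[order]                            # P_2
    z[t + 1:] = 2.0 * z[t] - z[t + 1:]      # F
    return z
```

Each step costs a number of operations proportional to $M$, so the whole product is no more expensive than a product with the original tridiagonal-like matrix.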


Secondly, to perform the process beginning on the coarser level, one must be
able to form $A_H$ ($= R_h^H \tilde{A}_h I_H^h$) times a vector; this can be done by
using code to compute $\tilde{A}_h$ times a vector along with code to compute
$R_h^H$ times a vector and $I_H^h$ times a vector. This is very inefficient, because
each matrix-vector product on a coarse level then requires computing a
matrix-vector product on the finest level in addition to the work of restriction
and interpolation. Since this was intended to be a pilot study, such inefficiency
was judged acceptable. In general, however, it is important to be able to form the
matrix-vector products on the coarser levels efficiently. With regard to this, it
is interesting and encouraging that with $M = 7$, linear interpolation given
by (26), and $c(x) = 1$, the matrix $A_H$ turns out to be


$$\frac{1}{2}\begin{pmatrix} 0 & -1 & 1 \\ -1 & 1 & 0 \\ -3 & 2 & 0 \end{pmatrix}$$

which becomes, if it is "unreflected" and "unpermuted,"

$$\frac{1}{2}\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 1 \end{pmatrix}$$


which is (up to a scaling factor) the matrix one would get by writing a difference
scheme for the same differential equation on a three-point mesh. The same holds for
larger values of $M$. Thus the computations with $A_H$ could be performed as if one
were working on a mesh with $(M - 1)/2$ points. Even in the variable coefficient case
it may turn out that the matrix $A_H$ can be sufficiently well approximated by a
matrix that comes from a difference scheme on the coarse mesh that the overall
process will still converge. This problem has not yet been investigated.

In any event, the process was programmed for some test problems; it worked quite
well, with an average error reduction in the $\ell_2$-norm of about 0.5 to 0.6 per step
(see table 3 for details).


To conclude this section, let us describe how the whole procedure could be 
applied to a nonlinear system of the form 


$$[A(u)u]_x = f(x,u), \quad 0 < x < 1; \qquad u(0) = g \qquad (45)$$

In brief, the idea is to apply Newton's method, solving the linear systems at each
stage of Newton's method with a multilevel technique (this has been called the
Newton-multigrid technique). Suppose the system is discretized using centered
differences; a nonlinear system of equations of the form

$$\Phi(u) = 0, \qquad u = (u_1, \ldots, u_M)^T, \qquad
\Phi(u) := \begin{pmatrix}
A(u_2)u_2 - A(g)g - 2\Delta x\, f(x_1,u_1) \\
A(u_3)u_3 - A(u_1)u_1 - 2\Delta x\, f(x_2,u_2) \\
\vdots \\
A(u_M)u_M - A(u_{M-1})u_{M-1} - \Delta x\, f(x_M,u_M)
\end{pmatrix} \qquad (46)$$

arises. Newton's method is then

$$u_0 \ \text{given}; \qquad D\Phi(u_n)\,\Delta u_n = -\Phi(u_n), \qquad
u_{n+1} := u_n + \Delta u_n \quad \text{for } n \ge 0 \qquad (47)$$

where $D\Phi$ is the Jacobian matrix of $\Phi$. In this case, the Jacobian matrix
will have the form
TABLE 3.- RESULTS FOR [c(x)u]_x = f(x), 0 < x < 1, u(0) = g_0

c(x) = 1, f(x) = 0, g_0 = 0

1. Two levels, 7 grid points on fine level, 1 Richardson sweep on fine
   level; average convergence rate 0.499.

2. Three levels, 15 grid points on fine level, 1 Richardson sweep on
   fine level.

   No. of Richardson sweeps on level 2     Average convergence rate
                  1                                 0.591
                  2                                 0.566
                  3                                 0.557
                  5                                 0.551
                 10                                 0.549
                 20                                 0.549

3. Six levels, 63 points on fine level, number of Richardson sweeps:
   1, 3, 3, 3, 3 on levels 1 (= fine), 2, 3, 4, 5 respectively; average
   convergence rate 0.557.

4. Four levels, 31 grid points on fine level, 1 Richardson sweep on
   fine level.

   No. of sweeps on level 2   No. of sweeps on level 3   Average convergence rate
              1                          1                        0.603
              1                          2                        0.604
              1                          3                        0.599
              2                          1                        0.574
              2                          2                        0.568
              2                          3                        0.566
              3                          1                        0.561
              3                          2                        0.558
              3                          3                        0.557

c(x) = 1 + x, f(x) = 1, g_0 = 1

   Two levels, 7 grid points on fine level, 1 Richardson sweep on fine
   level; average convergence rate 0.519.

c(x) = 1 + x, f(x) = 2 + 2x, g_0 = 1

   Two levels, 7 grid points on fine level, 1 Richardson sweep on fine
   level; average convergence rate 0.452.

All average convergence rates measured as $(\|e_{20}\| / \|e_1\|)^{1/19}$, where norms are
$\ell_2$ norms and $e_1$, $e_{20}$ are the error vectors at steps 1 and 20, respectively.
The parameters for the implicit complex Richardson sweeps on level $k$ were
$\mathrm{Re}(h) = -0.05 \cdot 2^{k-1}$, $|h|^2 = 0.25 \cdot 4^{k-1}$ (level 1 = finest level).

$$\begin{pmatrix}
-2\Delta x K_1 & J_2 & & \\
-J_1 & -2\Delta x K_2 & J_3 & \\
 & \ddots & \ddots & \ddots \\
 & & -J_{M-1} & J_M - \Delta x K_M
\end{pmatrix}$$

where $J_i$ is the Jacobian matrix of $A(u_i)u_i$ and $K_i$ is the Jacobian matrix of $f$.

To solve the linear system given above by multilevel techniques one would, following
along the lines of the previous sections, premultiply the system by

$$C := \mathrm{Diag}(J_1^{-1}, \ldots, J_M^{-1})$$

and then use the previous ideas in their extensions from scalar to "block" form; that
is, all permutations would be done blockwise, etc. The whole process was programmed 
and applied to a test problem of smooth supersonic expansion around a corner (the 
Prandtl-Meyer problem). The method worked; Newton's method converged nicely (the 
analytical solution of the problem is known, so the initial guess could be chosen 
fairly close to the exact solution of the continuous problem), and the multilevel 
procedure at each stage of Newton's method also converged adequately. The next prob- 
lem to investigate is one of subsonic flow. 
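
The outer Newton iteration can be sketched for a scalar version of the problem. Everything specific in the example below is an assumption made for illustration: the flux $A(u)u = u + 0.1u^2$ is hypothetical, the right-hand side is taken to depend on $x$ only (so the Jacobian of $f$ drops out), and a direct solve stands in for the multilevel solver that the text uses at each Newton stage. It is not the Prandtl-Meyer computation described above.

```python
import numpy as np

def flux(u):
    """Hypothetical scalar A(u) u."""
    return u + 0.1 * u ** 2

def dflux(u):
    """Derivative of the hypothetical flux."""
    return 1.0 + 0.2 * u

def residual(u, g, dx, f):
    """Centered-difference residual with linear extrapolation at the right."""
    M = len(u)
    x = dx * np.arange(1, M + 1)
    phi = flux(u)
    e = np.empty(M)
    e[0] = phi[1] - flux(g) - 2.0 * dx * f(x[0])
    e[1:-1] = phi[2:] - phi[:-2] - 2.0 * dx * f(x[1:-1])
    e[-1] = phi[-1] - phi[-2] - dx * f(x[-1])
    return e

def jacobian(u):
    """Jacobian of the residual when f does not depend on u."""
    M = len(u)
    d = dflux(u)
    J = np.zeros((M, M))
    for i in range(M):
        if i > 0:
            J[i, i - 1] = -d[i - 1]
        if i < M - 1:
            J[i, i + 1] = d[i + 1]
    J[M - 1, M - 1] = d[M - 1]
    return J

def newton_solve(u0, g, dx, f, iterations=10):
    """Newton's method; np.linalg.solve replaces the multilevel solver."""
    u = np.array(u0, dtype=float)
    for _ in range(iterations):
        u = u + np.linalg.solve(jacobian(u), -residual(u, g, dx, f))
    return u
```

In a Newton-multigrid code the call to `np.linalg.solve` would be replaced by a few multilevel cycles on the preconditioned, permuted, and reflected linear system.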

In conclusion, an attempt has been made to give a framework for the analysis of 
multilevel methods that is sufficiently general to embrace both elliptic and non- 
elliptic problems. The key ingredients are the relaxation process, the interpolation 
and restriction processes, and their relation to eigenvectors of the matrix of the 
linear system. Emphasis has been on the smoothness of the eigenvectors associated 
with small eigenvalues. What would be desirable would be a way to precondition the
linear system so that the eigenvectors associated with small eigenvalues are smooth on the given grid. Then
natural restriction and interpolation processes should work well. For some one- 
dimensional problems, such a preconditioning has been given. A basic problem is to 
find such a preconditioning for problems in more than one space dimension, for it is 
only in higher dimensions that the full power of multilevel techniques can make itself 
felt.


ADVANTAGES OF MULTI-GRID METHODS FOR CERTIFYING
THE ACCURACY OF PDE MODELING* 


C. K. Forester 

Boeing Military Airplane Company 


SUMMARY 


Application of computer-aided analysis techniques for modelling par- 
tial differential equations (PDE) requires specification of the boundary 
conditions, initial conditions, modelling error criteria and error toler- 
ances that relate realistically to the physical problem of interest. To be 
valid, the analysis must feature numerical techniques for assessing and 
certifying the accuracy of the modeling of the PDE to the user's specifica- 
tions. Examples of the certification process with conventional techniques 
(references 1-15) are summarized for the 3-D steady full-potential and the
2-D steady Navier-Stokes equations using fixed grid methods (FG). The
advantages of the Full Approximation Storage (FAS) scheme of the multi-grid
(MG) technique of A. Brandt (references 16-19) compared with the conventional
certification process of modeling PDE are illustrated in 1-D with the
transformed potential equation. Inferences are drawn for how MG will 
improve the certification process of the numerical modeling of 2-D and 3-D 
PDE systems. Elements of the error assessment process that are common to FG 
and MG include 

1. generating physical domain trial grids that are useful for estimating
   the contamination of the results by residual and truncation errors,

2. assessing the contamination of selected trial grid solutions by
   the nature of the solution process (residual error effects),

3. assessing the contamination of selected trial grid solutions by
   the nature of the choice of the grid (truncation error effects),

4. adjusting the grid until the allowable error bounds are satisfied.

Error norms suitable to the application are an implied requirement.


* This work is performed under NASA Langley Research Center Contract NAS1- 
16408. The contract monitor is Phil Drummond who is associated with the 
Hypersonic Engine and Computational Methods Branches. 
CONVENTIONAL CERTIFICATION PROCESS


Numerical error assessment with conventional PDE modelling techniques 
(references 1-14) includes the following ingredients. A solution of finite 
difference equations (simultaneous system of algebraic equations) for a 
specific discretization of the analysis domain is generated for different 
choices of grid density and grid distribution in the analysis domain. It is 
common to use a sequence of grids of the same grid distribution that differ 
in grid count in each independent variable direction by factors of two — 2, 
4, 8, 16, 32, etc. The coarser grids can be generated by deleting every 
other point of the finer grids. The effects of the choice of grid distribu- 
tion are examined by choosing sequences of grids which have different mesh 
distribution. The data from all of these solutions of the grid-related 
equations is organized by constructing an error difference table. Solution 
differences are posted in order of the coarse-to-fine grids for each grid 
sequence. The solution differences are generated by subtracting the values 
of adjoining pairs of grid solutions of the dependent variables at all 
physical locations in the analysis domain that correspond to the grid coor- 
dinates of a grid of a selected intermediate density. Interpolation is used 
to relate other grid solutions to these selected grid coordinates. As the 
grid density increases the differences should decay approximately according 
to the formal order of accuracy for some selected mesh distribution. If 
this occurs, extrapolation to solutions to infinite grid density may be 
well-behaved and reliable estimates of the maximum global error on the 
finest grid may result. The preceding process appears to work best on the 
modelling of parabolic and elliptic equations in smooth domains with smooth 
boundary conditions. For mixed elliptic/hyperbolic systems erratic results 
may occur due to unresolved singularity regions and/or poor residual error
control.
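
The decay check and the extrapolation described above can be illustrated with a small sketch. It assumes three solutions at one physical location on grids whose densities differ by a constant factor (two by default), and it assumes the error behaves like a single power of the mesh interval; the function names are hypothetical.

```python
import math

def observed_order(u_coarse, u_medium, u_fine, ratio=2.0):
    """Estimate the order of accuracy from two successive solution
    differences taken from the error difference table."""
    d_coarse = u_coarse - u_medium
    d_fine = u_medium - u_fine
    return math.log(abs(d_coarse / d_fine)) / math.log(ratio)

def richardson_extrapolate(u_medium, u_fine, order, ratio=2.0):
    """Extrapolate the two finest solutions to infinite grid density."""
    return u_fine + (u_fine - u_medium) / (ratio ** order - 1.0)
```

For instance, values 1.48, 1.12, and 1.03 on successively doubled grids give an observed order of 2 and an extrapolated value of 1.0, so the estimated error on the finest grid is about 0.03. If the observed order differs markedly from the formal order, the extrapolation should not be trusted.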

A key aspect of the previous description is that grid adjustments are 
made in some pattern that tends toward a limiting grid configuration. A way 
to think about this is to define a goal-oriented reference grid ('goal 
grid') to which the initially selected grid sequences must evolve. The 
'goal grid' serves as the host upon which the solution will be known to some 
resulting error bound. The goal is that this error bound will be within the 
accuracy desired by the analysis process. It should be understood that the 
'goal grid' may not be exactly unique in pattern because of grid initializa- 
tion, grid generator, and grid-equation solver properties. It is assumed 
that adequate control of the residual error effect has been observed in the
process of assessing the truncation error effect. This is done by developing
a sequence of several solutions on each grid choice with various choices
of constraints on the residual tolerances that are used to terminate the
computations for each solution on that grid.

Conventional techniques for developing the data that is necessary to 
certify the accuracy of numerical modeling procedures are limited by five 
factors. 
1. Adequate control of residual-error effects is costly in computer
   time (computer resource intensive).

2. Because of (1) above, very limited numbers of solutions are
   available, which makes for very sparse information from which error
   estimates can be constructed.

3. Because suitable grid adjustment to control the error within
   desired bounds is cumbersome or impractical, arriving at proper
   grid configurations in mesh density and distribution is often
   very difficult or impractical.

4. Numerical error during grid refinement may be erratic, not
   monotonic. Confusion as to the grid adjustment needs can result.

5. The computer program machinery is usually not available for
   conveniently constructing the error table. This means that the
   error assessment process is manpower intensive.

These factors discourage careful, complete development of the 'goal grid'
solution. Without this, the accuracy of the result is unknown;
the meaning of the result is undefined and useless.


MATHEMATICAL DESCRIPTION 


Let

$$LU = 0 \qquad (1)$$

represent the PDE system of interest. In discretized operator notation
equation (1) is

$$L_I U_I = \tau_I - R_I \qquad (2)$$

where $\tau_I$ is the local truncation error and $R_I$ is the local residual error
for each cell of the analysis domain. The grid structure index, $I$, is
related to choices of maximum indices of independent variables ($K$, $L$, $M$) and
grid density distributions for each selected trial grid. $I_g$ is defined as
the 'goal grid' index.*

An ideal or perfect difference scheme for (2) is one in which the local 
truncation error does not contaminate the decoded variables of interest 
such as velocity, density, pressure, etc. Only the residual errors impact 
these variables. Thus, the user specifies exactly the locations in the 
geometry at which values of these variables are desired. With residual 
error control within adequate bounds, the accuracy of the result is insured 
within selected limits. 

Nonideal difference schemes are defined as those in which the local 
truncation error and residual errors simultaneously influence the value of 
the decoded variables. Except for specialized difference schemes for model 
problems, difference schemes for conventional applied analysis are noni- 
deal. Conventional steady state numerical modeling of the full-potential 
equation, the Euler equations, and Navier-Stokes equations in two- and
three-space dimensions are subject to both of these problems with the excep- 
tion of academic cases. Conventional techniques (references 11-14) primar- 
ily address the means of efficiently controlling the residual errors rather 
than the truncation errors. In the present account concern for the control 
of both error sources is present, but the emphasis is upon the approach for 
controlling the truncation error. This subject is closely related to the 
problem of proper grid adjustment from an initial state to the 'goal grid' 
state with proper residual control during the grid-adjustment process. 


TRUNCATION (GRID RELATED) ERRORS 

The local truncation error is formally defined as the magnitude that 
the left-hand side of (2) yields for each cell when the 'goal grid' solution 
is interpolated (restricted) to any other of the trial grids. It is 
targeted at zero in conventional representations of (2) for all values of I. 
The form of MG of concern in the present account has the local truncation 
error targeted at zero only for the $I_g$ grid. The local residual is computed
by rearranging (2) and solving for $R_I$. A perfect computer solution of (2)
renders $R_I$ equal to zero to within round-off errors for all values of $I$ for
FG and MG approaches. $R_I$ is targeted at zero for all values of $I$ in FG and
MG approaches. Fortunately the MG solution process does not require any 
knowledge of the 'goal grid' solution in order to generate useful estimates 
of the local truncation error. It is a deferred correction process in which 
the relative local truncation error estimates between grid pairs are cor- 
rected as solutions on the finer grid levels become available. These 
corrections are not major at the coarser levels as the finer and finer grid 
levels evolve. This is one of the powerful aspects of MG. 

To clarify the nature of the grid structure index for conventional
conformal analysis, the nomenclature $I = IG_{JG}$ is introduced, where IG
refers to the grid density and JG refers to the grid configuration. For
example, nonorthogonal conformal-grid analysis methods require the user to
choose the stretch-factor-related parameters that control the physical
domain mesh interval rate changes in length and twist for the various
independent variable directions $\xi$, $\eta$, $\zeta$. One choice of these
parameters coincides with a value of JG. The selection of the number of grid
cells in the $\xi$, $\eta$, $\zeta$ directions relates to the selected value of IG
for a selected value of JG. The maximum values of $K$, $L$, $M$ are therefore
defined by IG for conventional computer program index controls on the problem
size. Symbolically these notions can be expressed as

    IG = IG(n_1, n_2, n_3)

    (n_1, n_2, n_3) = no. of grid intervals in the xi, eta, zeta directions

    K_max = n_1,  L_max = n_2,  M_max = n_3

    JG = JG(O, P, Q)

    O, P, Q = physical space grid compression functions along the
              xi, eta, zeta directions

* This definition is incomplete for composite grids which feature grid
nesting, grid overlays or coupled conformal regions with discontinuous
interfaces in the logic or transformed space.

Figure 1 shows a typical distorted hexahedral type of computational
cell. Hybrid finite difference/finite element analysis techniques employ
ordered arrays of these cells. The resulting aggregate of cells can be
viewed as nonorthogonal conformal-type analysis grids. A photograph of a
research V/STOL inlet is shown in figure 2. Front and elevation view sketches
of this inlet are shown in figure 3. A 3-D grid of this inlet has been
generated for full-potential flow analysis (reference 10). Figure 4a shows
a longitudinal slice of this grid through the crown and keel lines of the
inlet. Figure 4b shows typical grid detail near the hilite of the inlet.

Conventional certification of the accuracy of full-potential flow
analysis with this type of grid has been performed (references 7, 10) in
which variations in IG and JG were made to develop the 'goal grid' shape.
Additionally, direct experience with the conventional certification process
of the accuracy of Navier-Stokes analysis with conformal grid techniques
(references 2-4, 6, 9) and with composite conformal grid techniques
(reference 4) has been accumulated. This experience has led to an
understanding of practical problems of defining where grid adjustment is needed
and how to make the proper grid adjustments with conventional techniques in
order to obtain the required 'goal grid' solutions. The conclusion is that
vast improvements on the conventional techniques are needed. The critical
areas that require improvements for efficient application of surface grid
(panel methods) and field grid methods include developing

1. practical error monitors and practical error bounds that assist
   the grid adjustment processes efficiently,

2. discretized analysis formulations that permit more grid
   flexibility attendant with reduced grid-adjustment complexity,

3. more efficient and flexible approaches to reducing residual
   errors,

4. processes for coupling the grid adjustment and error monitor
   together so that the bulk of the computational effort is directed
   to the 'goal grid' configuration. In other words, develop
   schemes that minimize the effort expended on trial grids in which
   the truncation errors are out of bounds. This is the goal for the
   most cost effective applied analysis methods. Many intermediate
   steps toward this goal are required.
In order to begin the development of MG error assessment technology,
model problems with exact solutions are being studied. The results of some 
of this work are discussed in the next section for the mass conservative 
transformed potential equation. Boundary conditions of the exact velocity 
are used to set the gradient of the velocity potential at the grid entrance 
and exit cell faces. Iterative adjustment of boundary velocity potential is 
used at every relaxation sweep. The analytical velocity solution is a 
function of the ratio of channel entrance area to the channel cross section 
at any other station of interest. The channel area variation with station 
position is defined by cubic functions. 


RESULTS OF 1-D ERROR ASSESSMENT STUDY
FOR STEADY INCOMPRESSIBLE FLOW

The discretized 3-D full-potential equation is restricted to a 1-D
analysis tool (by deleting the L and M indices). The total velocity can be
computed in many ways. One formulation (reference 7) yields values of the
total velocity in which the truncation errors in the velocity potential do
not contaminate the computation of the total velocity. This formulation is
used to illustrate uses for the local truncation error estimates. Various
properties of the MG process are illustrated also. A 1-D incompressible flow
problem for which analytical solutions are readily available is employed
for error assessment. An adaptively gridded test case is presented and its
implications are discussed.

The 1-D test problem involves an analytical geometry of a straight 
channel with a cubic function for a constriction that reverts either 
abruptly step-wise or smoothly to a straight channel. Figure 5 shows the 
channel section shape distribution with respect to the flow direction. 
Figure 6 shows the analytical solution restricted to 65 grid coordinates (64 
cells) with the grid intervals constant. Eleven trial fine-grid sets (JG_max
= 11) were used to examine the 1-D potential solution properties for a)
grid with uniform mesh intervals, b) grid with uniform mesh intervals in the 
region of cross sectional area variation but with a stretch factor of two in 
the straight sections, and c) grids with uniform mesh intervals in the 
straight sections but with a stretch factor of .80, .85, .90, .95, 1.0, 
1.05, 1.1, 1.15, and 1.2 in the constricted region where the finest grid is 
near the abrupt enlargement of the channel cross sectional area for stretch 
factors less than unity. The total number of grid intervals for each set of 
trial grids are 4, 8, 16, 32, and 64 (IG_max = 5), where the number of grid
intervals in the constriction region are respectively 2, 4, 8, 16, and 32. 
FG and MG methods have been applied to generate solutions for these sets of 
trial grids. The general character of these solutions is shown in Figure 6 
by the solid line for the finest grid. Also shown in Figure 6 are the FG and
MG solutions with very nonstringent residual error tolerances. Using point
relaxation and sweeping the grid in the flow direction, MG yields a maximum
global error of less than 4% in the equivalent of twenty-five sweeps of the
64 node grid whereas FG requires over one thousand sweeps of the 64 node
grid to achieve the same accuracy. The maximum error occurs at the geometric
discontinuity. Increasing the accuracy by an order of magnitude
requires less than a factor of three increase in the work for the MG and the 
FG. The process of solving the problem to greater accuracy can be continued 
until the maximum global error satisfies desired constraints up to round- 
off error effects. The boundary conditions are imposed both on the FG and 
MG as set mass rates of equal magnitude at the entrance and exit cross 
sections. 

Control of the contamination of the total velocity output is cor- 
related with the computer work expended in solving the grid equations. The 
data shows that the residual error control efficiency increasingly favors 
MG over FG as the number of grid points is increased. This result is in 
keeping with Brandt's (references 16-19) results. For a simple elliptic
problem, this result establishes one type of advantage of MG over FG procedures:
MG is asymptotically more efficient than the FG strategy in controlling
residual error. Hence the number of grid points that can be
considered in an analysis with MG is greater than FG for a given computer 
budget. The inference of this advantage is summarized as: the potential 
for control of truncation error is greater with MG than FG strategy for 
nonideal difference schemes. 

Residual errors and maximum global errors are observed to be directly 
linked. This can be examined by computing the discrete continuity balance 
(local mass balance) on each cell. By dividing the local mass balance by 
the local channel cross sectional area, a delta velocity results which if 
added to the local velocity is the correction necessary to remove the local 
residual error. The maximum global error is reduced to round-off error 
(below ten to the minus ten) when the residual velocity correction is 
applied successively from the entrance region point-by-point through the 
grid to the exit region. Alternatively the maximum global error can be 
computed directly from the sum of the residuals of the same sign divided by 
the channel cross section at which the sign in the residual changes. The 
channel entrance area has been set to unity. 

The form of MG that is used for the computations involves a nonzero 
right-hand side term. With this formulation the discretized continuity 
equation is viewed as having a mass source right-hand side term which is 
constructed from the estimate of the local truncation error. Fine grid 
velocity potential data are interpolated (restricted) to coarse-grid continuity
balances to obtain estimates of the local truncation error, whose
global integral is zero for mass conservation. Total velocity output that
is decoded from solutions of these coarse-grid Poisson-type equations is
not directly useful (with an academic exception). This is a key point about
MG output: the total velocity output on the coarsest grids is useless in
itself. This point is illustrated in Figure 7 for three grid levels.
Note that the results near the geometric discontinuity are always badly in 
error. In the coarsest grid the local truncation error from the geometric 
discontinuity contaminates the total velocities at three cell faces where 
the solution is developed. The extent of the contamination is reduced 
dramatically as the grid is refined but it is only eliminated on the finest
grid level where it is exactly zero by choice. Any other choice for the
finest grid solution would generate worse results than that shown; the
maximum global error would be larger near the discontinuity than occurs in
the present example. Therefore the truncation error extrapolation,
$\tau$-extrapolation (ref. 15), cannot be inserted at the finest grid level, only
at next-to-the-finest grid levels. As shown in reference 19, it can be used
as a method for accelerating solution convergence or for generating still
finer grid solutions (finer than the 64 cell cases in the present example) at
lower cost. Alternatively, a finest grid selection of 32 cells could be
used with $\tau$-extrapolation to get the solution that is shown in Figure 6.


Figure 8 shows the truncation error spectrum. The peak values of the
local truncation error asymptotically approach nearly the same values,
including with $\tau$-extrapolation on the next-to-the-finest grid solution. The
magnitudes of these terms are substantial near the discontinuity and,
because they form the right-hand side of the cell-wise flux balance equations,
induce large errors in the total velocity profiles that are shown in
Figure 7. The coarse-to-fine grid correction equation of Brandt very 
effectively interpolates the Poisson type solutions on coarser grids so 
that the coarser grid solutions mimic the finer grid solutions. Standard 
interpolation, cannot account on the next finer grid, 1+1, 
for the fact that the right-hand side term is significant in the coarser 
grid solutions. For this reason standard interpolation is not useful and 
must be replaced by a more elaborate interpolation. Brandt recommends 
where $I_{l+1}^{l}$ is the fine-to-coarse grid interpolation operator and $I_{l}^{l+1}$ is the
coarse-to-fine grid interpolation operator. This expression functions well
as illustrated in Figure 9. Linear interpolation is used for these operators
with weightings of 1/4 and 3/4 for $I_{l}^{l+1}$ and weightings of 1/2 and 1/2
for $I_{l+1}^{l}$. No modification of these weights is used for stretched grid cases,
whose principal effect is to retard the convergence rate by up to about one-third
for cases with stretch factors of .80 and 1.2.
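
The transfer operators and weights described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a one-dimensional cell-centered grid in which each coarse cell covers two fine cells, the usual FAS form of the coarse-to-fine correction, u_fine + P(u_coarse - R u_fine), and a simple copy of the end value where the interpolation stencil reaches past the boundary. The function names and the boundary treatment are hypothetical and are not taken from the paper.

```python
import numpy as np


def restrict(u_fine):
    """Fine-to-coarse transfer with weights 1/2 and 1/2 (pairs of fine cells)."""
    u_fine = np.asarray(u_fine, dtype=float)
    return 0.5 * (u_fine[0::2] + u_fine[1::2])


def prolong(u_coarse):
    """Coarse-to-fine linear interpolation with weights 3/4 and 1/4.

    Assumption: at the two ends the missing coarse neighbour is replaced by
    a copy of the end value.
    """
    u_coarse = np.asarray(u_coarse, dtype=float)
    padded = np.concatenate(([u_coarse[0]], u_coarse, [u_coarse[-1]]))
    left, centre, right = padded[:-2], padded[1:-1], padded[2:]
    u_fine = np.empty(2 * u_coarse.size)
    u_fine[0::2] = 0.75 * centre + 0.25 * left   # left half of each coarse cell
    u_fine[1::2] = 0.75 * centre + 0.25 * right  # right half of each coarse cell
    return u_fine


def fas_correct(u_fine, u_coarse):
    """FAS correction: only the coarse-grid change is interpolated."""
    return u_fine + prolong(u_coarse - restrict(u_fine))
```

Because only the difference between the coarse solution and the restricted fine solution is interpolated, a coarse solution that equals the restricted fine solution leaves the fine grid unchanged, which plain interpolation of the coarse solution would not guarantee.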


The following discussion covers uses of the local truncation error estimates
for grid adjustment. A simple example of semi-adaptive
grid refinement is shown in Figure 10, in which grid compression toward the
region of high local truncation error is used. Iterative grid compression
is continued until the maximum normalized local truncation
error is less than .08. Semi-adaptive grid compression is implemented in
the interval 0 ≤ z/L ≤ 1 by iteratively decrementing the grid stretch factor
from an initial value of 1.2 in steps of .05. As expected, the
tolerance on the maximum local truncation error cannot be satisfied as long as
an exact step-wise discontinuity is enforced at z/L equal to unity. With
a cubic transition function in the interval 31/32 ≤ z/L ≤ 32/32 which has
slope continuity with the remaining channel geometry, the local truncation
error is reduced by grid refinement. Figure 10 shows the results
of the analytical solution and the solution with a grid contracted toward z/L
equal to unity. Over an order of magnitude reduction in the local truncation
error is readily achieved with a contraction ratio of .85. Obviously the
selection of $\tau_{\max} < .08$ as the criterion for stopping the computation is
arbitrary. The lower the magnitude selected for the stopping criterion, the
more grid is compressed into the region of the abrupt geometry change.
Eventually this approach starves the remaining domain of the analysis of
sufficient mesh to satisfy the selected maximum local truncation error
tolerance. Therefore a preferred strategy involves subdividing the region
of small length scale, 31/32 ≤ z/L ≤ 32/32, with a uniform grid of varying
number of grid points. It is easy to implement and is also regarded as
semi-adaptive. A 'fully' adaptive strategy requires labeling each cell of a
trial grid with a special flag that designates cells with a local truncation
error that exceeds a selected threshold value. Cells so flagged may be
subdivided by nesting compressed grids or by uniform interval grid embedding.
A 'fully' adaptive MG strategy requires only that iterative work to reduce the
truncation error be applied to the flagged cells. This approach may be more
efficient than the semi-adaptive strategies, but it is also more computer
programming intensive. It nevertheless appears to be practical to
program for machine computations.
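
The flagging step of the 'fully' adaptive strategy can be illustrated with a minimal Python sketch. It assumes a one-dimensional grid described by its cell edges, one normalized truncation error estimate per cell, and uniform bisection of every flagged cell; the function names, the bisection rule, and the default threshold of .08 (borrowed from the semi-adaptive example above) are illustrative assumptions rather than the author's procedure.

```python
import numpy as np


def flag_cells(tau, threshold=0.08):
    """Flag cells whose normalized local truncation error exceeds the threshold."""
    return np.abs(np.asarray(tau, dtype=float)) > threshold


def refine_flagged(edges, flags):
    """Return new cell edges with every flagged cell split into two equal cells."""
    edges = np.asarray(edges, dtype=float)
    new_edges = [edges[0]]
    for left, right, flagged in zip(edges[:-1], edges[1:], flags):
        if flagged:
            new_edges.append(0.5 * (left + right))  # embed a uniform sub-interval
        new_edges.append(right)
    return np.array(new_edges)
```

For example, with edges [0, 0.5, 1] and error estimates [0.01, 0.5], only the second cell is flagged and the refined edges become [0, 0.5, 0.75, 1]; relaxation work would then be confined to the flagged region.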

It is clear in the preceding simple problem that the local truncation 
error estimates indicate the proper region in which grid adjustment (mesh 
density or distribution) should occur or the proper region in which the 
geometric representation of the boundary of the analysis domain may need 
modifications. It is expected that shock wave or unresolved shear layer 
singularities would likewise produce normalized local truncation error 
estimates of the order of unity. A list of six causes of large truncation 
error includes 

1. mesh density

2. mesh distribution 

3. shock singularity 

4. unresolved shear layer singularity

5. boundary condition discontinuity 

6. improperly controlled residual errors 

It is certain that the local truncation error estimates in themselves 
cannot distinguish among these six causes of large local error or whether 
the desired results of the analysis output are adversely affected. 
Therefore additional information must be associated with the local trunca- 
tion error estimates to make them useful in certifying the accuracy of a 
numerical PDE modeling. For this purpose, error norms must be developed
that assist in identifying the 'goal grid' solution. Adaptive grid computations
are defined as those that utilize, in an automated fashion, a link
between the error norms and an adjustment of the analysis strategy. This is 
the essence of MLAT-FAS MG of reference 19. It is obvious that such an
approach is designed with the problem of certifying the accuracy of the PDE 
modeling in mind. FG technology simply does not connect the error 
assessment difficulties with the solution algorithm design. 
SUMMARY OF THE 1-D INCOMPRESSIBLE FLOW STUDY RESULTS


1. The FAS-MG process accelerated the reduction of the residual
errors in a manner which increasingly favors MG over FG as the
number of grid points is increased.

2. The FAS-MG process is straightforward to implement with standard
FG grid-equation solution processes. Provided the FG solution
process is convergent, FAS-MG is also convergent.

3. Sums of the same-sign residual error are analytically related to 
the maximum global error whether or not a geometric discontinuity 
exists. 

4. Standard grid-equation formulations are modified for FAS-MG by 
the inclusion of an additional right-hand side term. This term is 
related to the local truncation error. 

5. Solutions of the grid equations on all but the finest grid cannot 
be used directly for estimating the PDE solution. This is due to 
the non-zero right-hand side term of the coarse grid solutions 
for which proper account must be made before the coarse grid 
solutions are used. 

6. Standard interpolation fails to be useful for prolongating 
coarser grid MG solutions to finer grid levels. Brandt's FAS-MG 
formula is effective for this purpose. 

7. Estimates of the local truncation error are a direct consequence 
of the FAS-MG process. 

8. The sign of the local truncation error oscillates at the highest
possible frequency of two mesh intervals for an ideal difference
scheme. This produces a cancellation of the local truncation
error in the solution. Control of residual errors is all-important
for satisfying desired global error bounds; local
truncation error is of no consequence in an ideal difference
scheme.

9. Regions of the trial-grid solution in which the normalized local 
truncation error is of the order of unity are indicative of some 
potential problem with the analysis. 

10. It is conjectured that nonideal difference schemes will exhibit 
two-mesh-interval sign oscillations in the local truncation error 
estimates only at singularities or at locations which have grid- 
related problems. Otherwise the local truncation error estimates 
will persist at longer wavelengths. 
11. It is conjectured that sums of the same-sign local truncation
errors are significant to estimating the maximum global error for
nonideal difference schemes. Useful sums may or may not include
the regions of large local truncation error depending on the
purpose for the error norm.


OBSERVATIONS AND RECOMMENDATIONS 


No simplifying assumptions have been made in the use of the Brandt
FAS-MG scheme, and it is expected to be generally applicable and
straightforward to apply to existing 2-D and 3-D codes. The following comments are
supplied from that viewpoint.

From the standpoint of a practitioner of conventional applied analysis
techniques for modeling PDE systems, the following goals of future work
appear to be desirable.

1. Modify conventional applied analysis codes with the Brandt FAS
scheme so that local truncation error estimates are a routine
output. This will aid in quickly identifying regions of the
analysis domain where one or more of the six large truncation error
problems exist. Concern over full MG optimality is not the issue
for the short term; primarily it is desirable to reduce the labor
involved in determining where in an analysis serious potential
numerical error problems reside.

2. Develop error norms that properly exploit the local truncation 
error estimates of MG so that conventional, semi-adaptive and 
adaptive composite grid technology can achieve high efficiency in 
the PDE modeling certification process. 

3. To be effective the grid generation process and the grid-equation 
solution process must be drawn together. Composite grid 
technology in the context of the PDE modeling certification 
process should be encouraged. Composite grids refer to the 
broadest definition of grid design, coupled conformal grids in 
which nested grids or grid overlays are permitted by the analysis 
approach. 

4. It is customary to compare conventional analysis results with
experimental data for validation. This practice should
eventually yield to the more precise requirement that the PDE
modeling error assessment and numerical accuracy certification
be an independent function of high standing. The
comparison with experimental data could then assume the proper
role of checking that the PDE system is appropriate to the goals
of the analysis application. Such a practice will offer the
advantage that the PDE modeling errors will be distinct from the
PDE formulation errors. This advantage is not commonly
exploited.
Figure 2. Asymmetric V/STOL Research Inlet

Figure 5. One-dimensional Incompressible Flow Channel

Figure 6. Residual Error Effects on Fixed Grid and Multigrid Solutions

Figure 7. [...] Effects on Intermediate Grid Level Solutions

Figure 8. Truncation Error Spectrum as a Function of Grid Density
(axes: normalized truncation error versus channel length ratio, z/L)

Figure 10. Adaptive Grid Control to Selected Truncation Error Tolerance
SUMMARY


Multigrid algorithms based on the weighted mean scheme are developed for 
the solution of the two-dimensional incompressible Navier-Stokes equations. 
They are applied to two typical problems encountered in engineering applica- 
tions, namely, the convection-diffusion problem of the Benard convection cell, 
and the driven cavity problem. An analysis of the smoothing rates and stabi- 
lity is given. The efficiency of the multigrid method is investigated. 


The Governing Equations and Solution Technique 


The two-dimensional steady-state incompressible Navier-Stokes equations 
are considered with vorticity, stream function and temperature as the depen- 
dent variables. They are written in the convective form: 

$$\alpha\left(u\,\frac{\partial\phi}{\partial x} + v\,\frac{\partial\phi}{\partial y}\right) = \nabla^{2}\phi \qquad (1)$$


where $\phi$ represents appropriately nondimensionalized vorticity $\zeta$ or
temperature $T$, and $\alpha$ correspondingly represents the Reynolds number or the
Peclet number; $u$ and $v$ are the nondimensional velocity components in the
x- and y-directions. The stream function $\psi$, of course, satisfies the
Poisson equation

$$\nabla^{2}\psi = -\zeta. \qquad (2)$$

As a simple test case, we consider the "Benard cell" problem, where we solve 
for the temperature T, with a given velocity field 

$$u = -\cos(y)\sin(x),$$

$$v = \cos(x)\sin(y), \qquad 0 \le x, y \le \pi$$

Research of LRL and QH was supported by NASA Contract No. NASl-16572; research 
of MYH was supported by NASA Contract No. NASl-16394. 
and boundary conditions

$$T = 0 \quad \text{on } y = \pi,$$

$$T = 1 \quad \text{on } y = 0,$$

$$T_x = 0 \quad \text{on } x = 0, \pi.$$


The Numerical Method 


We use the weighted mean scheme for the finite-difference approximation 
to Equation (1). The concept of the weighted mean scheme appears to be 
originally due to Allen and Southwell (ref. 1) and it has been rediscovered 
several times since then. 

As recently pointed out by Gresho and Lee (ref. 2) this scheme was 
rediscovered by Spalding (ref. 3) by a more intuitive approach. Raithby and 
Torrance found the scheme in the two-dimensional problems they considered 
particularly when the grid line and velocity direction were closely aligned. 
Later, Raithby (ref. 4) appears to have improved upon it for cases where 
the flow was not aligned closely with one of the coordinate lines. Fiadeiro 
and Veronis found the method again, called it the weighted mean scheme, and 
generalized it to two- and three-dimensions. The history of this scheme in 
the finite element literature is briefly reviewed in reference 2. All such
schemes become identical in the case of the one-dimensional steady-state
advection-diffusion equation with constant coefficients: they tend to the
pure central difference scheme for small Reynolds number (or Peclet number),
and to the pure upwind scheme for large Reynolds number (or Peclet number);
the numerical solution of the discretized equation agrees identically at the
nodes with the exact solution, and thus resolves boundary layers.
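
The three properties just listed (central differencing for small cell Reynolds number, upwinding for large cell Reynolds number, and exactness at the nodes) can be checked numerically. The Python sketch below assumes the one-dimensional model problem alpha*u*phi' = phi'' on a uniform grid of spacing h and uses the classical exponentially fitted three-point weights usually attributed to Allen and Southwell; the normalization and the function names are assumptions and may differ from the coefficient definitions used in the paper.

```python
import math


def _bernoulli(x):
    """B(x) = x / (exp(x) - 1), evaluated safely for small and large |x|."""
    if abs(x) < 1e-12:
        return 1.0
    if x > 700.0:            # exp(x) would overflow; the quotient is negligible
        return 0.0
    return x / math.expm1(x)


def weighted_mean_weights(alpha, u, h):
    """Return (W, C, E) with W*phi[i-1] + C*phi[i] + E*phi[i+1] = 0.

    Assumed model problem: alpha*u*phi' = phi'' with constant alpha*u.
    """
    cell_number = alpha * u * h               # cell Reynolds (or Peclet) number
    east = _bernoulli(cell_number) / h**2     # downstream weight decays when u > 0
    west = _bernoulli(-cell_number) / h**2    # upstream weight grows when u > 0
    return west, -(west + east), east


if __name__ == "__main__":
    for speed in (0.01, 20.0, 1000.0):
        print(speed, weighted_mean_weights(1.0, speed, 0.1))
```

For a small cell number the weights approach 1/h^2 plus or minus alpha*u/(2h), the central-difference values, and for a large positive cell number the upstream weight approaches alpha*u/h while the downstream weight vanishes, which is the upwind scheme. Because W = E*exp(alpha*u*h), both the constant and exp(alpha*u*x) satisfy the difference equation exactly.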

Following Fiadeiro and Veronis (ref. 5), we discretize Equation (1) as 
follows: 


$$C\,\phi_{i,j} + N\,\phi_{i,j+1} + S\,\phi_{i,j-1} + E\,\phi_{i+1,j} + W\,\phi_{i-1,j} = 0 \qquad (3)$$


For a uniform grid, the coefficients C, N, S, E, and W are defined in terms
of the local velocity components:

and C = -N-E-S-W.


Extension to a nonuniform grid is straightforward. Analysis and the numerical 
results show that the method is second order accurate. It is easily estab- 
lished that the coefficient matrix is diagonally dominant and positive 
definite, and hence any reasonable iterative method will be stable and 
convergent on any grid irrespective of the grid Reynolds number (or grid 
Peclet number). However, the solution thus obtained on a coarse grid may 
involve gross inaccuracies owing to discretization errors. But in a multi- 
grid context the coarse grids are used only for smoothing the high frequency 
component of the residual on the fine grid. Specifically, we use the cor- 
rection scheme algorithm, and generalize it to the full approximation al- 
gorithm in the terminology of Brandt (ref. 3). Gauss-Seidel relaxation 
or successive line relaxation is used to reduce the high frequency errors. 

A quantitative measure of the relaxation efficiency is the smoothing 
rate defined to be the eigenvalue, largest in magnitude, of the relaxation 
amplification matrix for high frequency components. To evaluate the 
smoothing rate, the usual local analysis is employed by assuming 


$$\phi_{kl} = \exp[i(k\xi + l\eta)] \qquad (5)$$

and applying the relaxation scheme. The new values of $\phi_{kl}$ will have amplitudes
different from unity. Ideally, the amplitudes of the high frequency modes
should be reduced. This provides the definition of the asymptotic smoothing
rate $\mu$:

$$\mu = \max_{\pi/2 \,\le\, \xi,\,\eta \,\le\, \pi} \left|\phi_{kl}^{\mathrm{new}}\right| \qquad (6)$$


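Consider now the Gauss-Seidel relaxation

$$C\,\phi_{kl}^{\mathrm{new}} + S\,\phi_{k,l-1}^{\mathrm{new}} + W\,\phi_{k-1,l}^{\mathrm{new}} + N\,\phi_{k,l+1}^{\mathrm{old}} + E\,\phi_{k+1,l}^{\mathrm{old}} = 0 \qquad (7)$$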


Using (5) in (7) we obtain the asymptotic smoothing rate $\mu_{gs}$ for Gauss-Seidel
relaxation

$$\mu_{gs} = \max \left| \frac{N e^{i\eta} + E e^{i\xi}}{C + S e^{-i\eta} + W e^{-i\xi}} \right| \qquad (8)$$
Obviously, the relaxation becomes inefficient if $N \gg S$ and $E \gg W$.

Examining explicitly the relations (4), it is found that $N \gg S$ if v < 0 and
|v| > 1/(αΔx). The inefficiency is remedied by performing the relaxation sweep
in the direction of y decreasing when v < 0, i.e., following the flow. This
is validated by the numerical results for the Benard cell problem summarized
in Table 1 and the smoothing factors presented in Diagram 1. For column
relaxation with increasing x, the smoothing rate is found to be

$$\mu = \max \left| \frac{E e^{i\xi}}{C + N e^{i\eta} + S e^{-i\eta} + W e^{-i\xi}} \right|$$

Diagram 2 shows the region where each of N, E, S, W dominates the rest.
Thus, one expects $\mu \sim 1$ in the upper region, and this is verified in
Diagram 3. When two relaxations are performed, one with x increasing and
the other with x decreasing, the errors are effectively reduced all over
the region. In fact, alternating row and column relaxation with changing
directions is quite efficient, reducing error by 20-30%. Table 2 presents
a comparison of SLOR and multigrid methods for the Benard cell problem.
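
The effect of the sweep direction on the Gauss-Seidel smoothing rate can be explored numerically. The Python sketch below evaluates the ratio of Equation (8) on a sampled frequency grid, once for a sweep in increasing y and once for a sweep in decreasing y. Several points are assumptions made only for this illustration: the neighbour weights are the exponentially fitted ones for alpha*(u*phi_x + v*phi_y) = del^2 phi (they may differ in normalization from the coefficient definitions used above), the high-frequency range is taken as max(|xi|, |eta|) >= pi/2, and all function names are hypothetical.

```python
import math

import numpy as np


def fitted_pair(alpha, vel, h):
    """Return (plus, minus) neighbour weights in one coordinate direction."""
    def bernoulli(x):
        if abs(x) < 1e-12:
            return 1.0
        if x > 700.0:
            return 0.0
        return x / math.expm1(x)

    r = alpha * vel * h
    return bernoulli(r) / h**2, bernoulli(-r) / h**2


def gs_smoothing_rate(N, S, E, W, sweep_y=+1, samples=65):
    """Largest amplification of a high-frequency mode for point Gauss-Seidel.

    sweep_y=+1: sweep in increasing y (S already updated, N still old).
    sweep_y=-1: sweep in decreasing y (N already updated, S still old).
    The x sweep is always taken in the increasing direction.
    """
    C = -(N + S + E + W)
    theta = np.linspace(-np.pi, np.pi, samples)
    xi, eta = np.meshgrid(theta, theta, indexing="ij")
    high = np.maximum(np.abs(xi), np.abs(eta)) >= np.pi / 2 - 1e-12
    ex, ey = np.exp(1j * xi), np.exp(1j * eta)
    if sweep_y > 0:
        num, den = N * ey + E * ex, C + S / ey + W / ex
    else:
        num, den = S / ey + E * ex, C + N * ey + W / ex
    return float(np.max(np.abs(num[high] / den[high])))


if __name__ == "__main__":
    E, W = fitted_pair(100.0, 0.0, 0.1)    # no flow in x
    N, S = fitted_pair(100.0, -1.0, 0.1)   # strong flow toward decreasing y
    print("against the flow:", gs_smoothing_rate(N, S, E, W, sweep_y=+1))
    print("following the flow:", gs_smoothing_rate(N, S, E, W, sweep_y=-1))
```

With v < 0 and a large cell Peclet number, N is much larger than S, so the sweep in increasing y leaves a large old-value term in the numerator and the rate is close to 1, while the sweep in decreasing y moves N into the denominator and the rate drops markedly.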



TABLE 1. Efficiency of correction scheme (Benard cell problem)

   Pe      Tol.     Work for plain     Work following
                    Gauss-Seidel       flow

   10      .01           17                 22
   50      .01           45                 24
  400      .005         200                172
  400      .001         313                208


TABLE 2. Comparison of SLOR and multigrid methods (Benard cell problem)

                   SLOR                      MULTIGRID
                                  Point relaxation    Line relaxation
  Peclet No.    iterations*            WU**               WU***

      10            85                  38                  20
      20           110                  35                  27
      40           140                  44                  38
      80           205                  83                  51
     160             -                 101                  96

GRID 25 X 25

*   one SLOR iteration is a sweep in x direction, followed by a sweep in
    y direction.

**  one WU is one sweep over whole field, following flow.

*** one WU is one sweep in the x increasing direction, followed by one sweep
    with x decreasing.





The Full Approximation Scheme (FAS) 


The FAS is not really required for solutions of linear problems, but just 
as a test case it was applied to the Benard cell problem. Speed-ups of 100% 
or so were obtained when the relaxation was performed following the flow. 

Table 3 summarizes the results for this problem. The calculations were done
on a finest grid of 25x25 with 4 levels, using T = y/π as the initial data.


TABLE 3. Acceleration of FAS convergence (relaxation following flow)

                                          WU
  Max. abs.     Peclet      Iteration      Iteration
  residual      no.                        following flow

   .001           10            48              36
                  20            50              31
                  40            65              36
                  80           136              61
                 160           211              98

   .0001          10            59              47
                  20            66              41
                  40            84              47
                  80           176              83
                 100           208              64
                 160           252             127



Instability 

There is a trade-off in the multigrid method between the accelerated 
convergence usually obtained and the possible divergence of the process. In
fact, simple one-level relaxation procedures for elliptic problems are always
convergent, and the weighted mean scheme extends this property to
convection-diffusion problems. However, the multigrid algorithm introduces
the possibility of divergence, which is actually met in practice. We mention
here briefly some of our experience with this phenomenon. 

Even in a linear problem, divergence may occur. In the Benard cell 
problem, at Pe = 400 the algorithm became "trapped" in the two coarsest
grids. This implies that the relaxation on the coarsest grid was always 
convergent (considering the convergence criteria of the algorithm) , while 
the relaxation on the next coarsest grid was never efficient. Eventually 
the calculation diverged to infinity. This was easily remedied by enforcing 
a few extra iterations on every grid — even when the algorithm would consider 
them inefficient — or by merely deleting the troublesome coarsest grid. The 
linear instability served, however, as a warning about possible complications 
in nonlinear cases. 
For the driven cavity problem we implemented a check to guard against
divergence. Namely, we introduced a priori bounds for the speeds $u$, $v$
and for $\zeta$, $\psi$. All these bounds are easily computed using the maximum principle
(see Appendix A). When such a bound is violated, there is clear indication
of divergence, which we treated by returning to the finer grid, without
interpolation from the coarser "incorrect" level. At this point we also
changed from the correction scheme (CS) to the full approximation scheme
(FAS), in order to be able to check the solution at any level and ensure it
stays within bounds.

We also tried a more theoretical investigation of the divergence 
phenomenon. An elliptic problem — and even a convection diffusion problem 
with the weighted mean scheme — produces a relaxation formula with positive 
weights. Thus, repeated relaxation at any level is convergent, and roundoff 
errors are not magnified. The only point where divergence may evolve is the 
level change. Since we used linear interpolation from coarse to fine grids — 
again positive weights — it is just the fine to coarse level change which may 
cause instability. 


Consider solving a homogeneous problem, using a two-grid scheme. We
shall identify the levels by subscripts c and f for coarse and fine grid
respectively. Let the discretized operators be $L_c$, $L_f$, and the relaxation
operators $R_c$, $R_f$. Also, denote by $I_a^b$ the transfer of data from level a to
b. Suppose that m relaxations are performed on the fine level, and n
relaxations on the coarse level. Then, an initial value $u_f$ becomes:


$$u_f^{\mathrm{new}} = \left[I_f - I_c^f\,(I_c - R_c^{\,n})\,L_c^{-1}\,I_f^c\,L_f\right] R_f^{\,m}\,u_f$$


The operator appearing on the right hand side should have a spectral radius 
less than one to ensure convergence. It is obvious that by taking m large 
enough, this will be achieved, since 

$$\|R_f\| < 1 \quad (\text{also } \|R_c\| < 1,\ \|I_c^f\| \le 1).$$

But this means performing more relaxations on the fine grid, thereby losing 
efficiency. 
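
The dependence of the two-grid operator on m and n can be examined numerically on a small model problem. The Python sketch below is an illustration only: it assumes the one-dimensional Poisson operator with Dirichlet boundaries, Gauss-Seidel relaxation on both levels, injection from fine to coarse, linear interpolation from coarse to fine, and n coarse relaxations started from a zero correction, which gives the operator [I - P (I - R_c^n) L_c^{-1} I_fc L_f] R_f^m. All function names are hypothetical, and no particular value of the spectral radius is claimed here.

```python
import numpy as np


def poisson_1d(n_intervals):
    """Three-point discretization of -d2/dx2 on (0, 1) with Dirichlet ends."""
    h = 1.0 / n_intervals
    n = n_intervals - 1
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2


def gauss_seidel_operator(A):
    """Error propagation matrix of one Gauss-Seidel sweep: I - (D + L)^{-1} A."""
    return np.eye(A.shape[0]) - np.linalg.solve(np.tril(A), A)


def transfer_operators(m_coarse):
    """Linear interpolation P (coarse to fine) and injection (fine to coarse)."""
    nc, nf = m_coarse - 1, 2 * m_coarse - 1
    P = np.zeros((nf, nc))
    inj = np.zeros((nc, nf))
    for j in range(nc):
        P[2 * j, j] = 0.5        # fine point to the left of the coarse point
        P[2 * j + 1, j] = 1.0    # fine point coinciding with the coarse point
        P[2 * j + 2, j] = 0.5    # fine point to the right of the coarse point
        inj[j, 2 * j + 1] = 1.0
    return P, inj


def two_grid_operator(m_coarse, m, n):
    """Fine-grid error propagation for m fine and n coarse relaxations."""
    L_f, L_c = poisson_1d(2 * m_coarse), poisson_1d(m_coarse)
    R_f, R_c = gauss_seidel_operator(L_f), gauss_seidel_operator(L_c)
    P, inj = transfer_operators(m_coarse)
    I_c = np.eye(L_c.shape[0])
    coarse_solve = (I_c - np.linalg.matrix_power(R_c, n)) @ np.linalg.inv(L_c)
    correction = np.eye(L_f.shape[0]) - P @ coarse_solve @ inj @ L_f
    return correction @ np.linalg.matrix_power(R_f, m)


def spectral_radius(T):
    return float(np.max(np.abs(np.linalg.eigvals(T))))


if __name__ == "__main__":
    for m, n in ((5, 1), (1, 1), (1, 5)):
        print(m, n, spectral_radius(two_grid_operator(8, m, n)))
```

Replacing poisson_1d by an advection-diffusion operator, or injection by another fine-to-coarse transfer, allows the same comparison for the other cases discussed in the text.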

Another possibility is to have the operator

$$I_f - I_c^f\,(I_c - R_c^{\,n})\,L_c^{-1}\,I_f^c\,L_f$$

small, in some suitable sense. If $n \to \infty$, i.e., coarse grid relaxations are
repeated to convergence, this expression may be simplified to:

If - If If If = (I If - If If) 

f cfff c cf ff 

Now, $L_c^{-1}$ is bounded, but the bracketed term is usually large. It is true
that it involves mainly the high frequencies (where $L_c$ and $L_f$ differ
significantly) and thus will behave for properly smoothed data, but smoothing
involves more applications of $R_f$, resulting again in loss of efficiency.

These results depend on the detailed expressions for $I_f^c$ and $I_c^f$. We use
injection and linear interpolation for these transfers; this preserves the positive
weights property, and a high order of accuracy in $I_f^c$. If another $I_f^c$ with
positive weights is used, it may be only first order accurate, and we
thought that this would conflict with the second order weighted mean scheme.
At first we assumed that the very unbalanced coefficients generated by the
weighted mean scheme may be a cause of instability, but the same overall
behavior obtains in pure diffusion problems, also.

We also considered a one-dimensional advection-diffusion equation with
constant coefficient, expecting to obtain more insight into the interplay
of various parameters, by explicitly computing spectral radii and norms
(see Appendix B).

A simple Fourier mode analysis shows many virtually
increasing amplitudes. These are reduced only by repeated relaxation on
the fine grid. However, the analysis may be misleading, because unrealistic 
periodic boundary conditions are assumed. If Dirichlet conditions are 
imposed, then the spectral radii stay below 1, for all m, n, speed and 
diffusivity values (the spectral radii had to be obtained numerically). This 
implies stability and convergence of the multigrid process. 

Some of these results are presented in Appendix B. One general property
that may be deduced is that relaxation against the flow is always inefficient
and possibly destabilizing. One broad conclusion may be stated: since there
are initially growing modes, proper care is needed in nonlinear problems.

CONCLUSIONS 

Efficient and accurate results are obtained by the correction scheme 
and full approximation scheme algorithms for the problem of Benard convection 
in a square cell. For the driven cavity problem, preliminary results have 
been obtained for Reynolds numbers of the order of 1000. The multigrid 
technique for this problem requires further refinement, with particular 
reference to interpolation procedures and grid switching criteria. 
Diagram 1. Gauss-Seidel Smoothing Rate x 100

(entries that cannot be read are shown as ??)

57  ??  95  96  97  97  97  97  97  96
?3  87  97  98  98  98  98  97  97  96
??  88  98  99  99  99  99  98  97  96
44  86  98  99  99  99  99  99  98  96
44  8?  97  99  99  99  99  99  98  96
4?  81  95  99  99  99  99  99  98  96
44  77  93  9?  99  99  99  99  98  95
44  70  87  94  97  99  99  99  98  93
44  61  75  85  91  94  96  97  96  89
44  50  56  61  66  70  73  75  74  67
44  38  33  28  24  21  19  17  16  20
44  27  15   8   5   3   2   1   1  16
44  18   6   2   0   0   0   0   1  17
44  12   2   0   0   0   0   0   2  19
44   8   1   0   0   0   0   0   3  21
44   5   0   0   0   0   0   1   6  25
44   4   0   0   0   0   1   3  10  28
43   4   1   1   1   2   4   9  17  33
41   6   8   9  10  13  16  21  29  38
55  ?4  44  44  44  44  44  44  44  44
Diagram 2. Column Relax. Increasing X Dominant Coefficient

e.g., N > 0.8 (N+E+S+W)

Diagram 3. Column Relax. Increasing X - Smoothing Rate x 100

APPENDIX A

Bounds for the Driven Cavity 


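Suppose, for simplicity, that the cavity is a unit square; then

$$0.08\,\zeta_{\min} \le \psi \le 0.08\,\zeta_{\max}$$

$$-0.08\,(\zeta_x)_{\max} \le v = -\psi_x \le -0.08\,(\zeta_x)_{\min}$$

$$0.08\,\left((\zeta_y)_{\min} - U\right) + U \le u = \psi_y \le 0.08\,\left((\zeta_y)_{\max} - U\right) + U$$

where U is the imposed speed on the top boundary. The constant 0.08 in the
above bounds is an upper bound on the maximum of $\phi$, which solves

$$\nabla^{2}\phi + 1 = 0,$$

$$\phi = 0 \quad \text{on the boundary}.$$

For $\zeta$ one uses the bound

$$\zeta_{\min \text{ on boundary}} \le \zeta_{\text{interior}} \le \zeta_{\max \text{ on boundary}}$$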
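
The constant 0.08 can be checked numerically by solving the auxiliary problem del^2 phi + 1 = 0 on the unit square with phi = 0 on the boundary. The Python sketch below is only a rough check: it assumes the standard five-point difference approximation on a uniform n-by-n grid and a dense direct solve, and the function name is hypothetical.

```python
import numpy as np


def max_of_unit_source_solution(n=32):
    """Maximum of the five-point solution of del^2 phi = -1, phi = 0 on the boundary."""
    h = 1.0 / n
    m = n - 1                                   # interior points per direction
    second_diff = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    identity = np.eye(m)
    # Discrete negative Laplacian on the m-by-m interior grid.
    neg_laplacian = np.kron(identity, second_diff) + np.kron(second_diff, identity)
    phi = np.linalg.solve(neg_laplacian, np.ones(m * m))
    return float(phi.max())


if __name__ == "__main__":
    print(max_of_unit_source_solution())
```

On a 32-by-32 grid the computed maximum, attained at the center of the square, is close to 0.074, so 0.08 is a slightly conservative upper bound.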


APPENDIX B. 

Initial Mode Amplification 

We discuss here briefly the behavior of the weighted mean scheme
solution of the problem:

(σ = 0.01)

φ(0) = φ(1) = 1

with the initial guess:

φ_k =        (0 < ω < π, 0 < k < N)

Using N = 2M subintervals on the fine grid, we perform m fine relaxations
(Gauss-Seidel with x increasing), then n coarse relaxations, and record
the largest magnitude of the resulting φ as a function of ω. This is
maximum at ω = π, but even at other values of ω (e.g., ω = π/4) this quantity
may exceed 1.

When a full multigrid procedure is implemented, there is always
convergence to φ = 1: the modes which initially grow decay subsequently.
Comparison of results at one given speed U (columns d, e, f, say, of
Table 4) shows that a large ratio of coarse iterations versus fine iterations
produces high amplification. It is to be noted that columns d, e, f concern
a pure diffusion problem; the weighted mean scheme reduces to simple central
differencing, and yet there are some growing modes. Relaxation sweep against
the flow (U = -1, columns a, b, c) consistently produces the worst amplification.
Similar results may be derived by the Fourier mode analysis.

TABLE 4

  Speed U                      -1                            0                             1
  Fine/coarse relax.    5/1     1/1     1/5        5/1     1/1     1/5        5/1     1/1     1/5

  ω = π     M = 5     2.4966  2.4972  3.0774     1.0000  5.2891  5.7665     1.0000  1.0004  1.0004
            M = 19    2.8570  4.8923 20.9650     1.0000  5.6666 26.9670     1.0015  3.6451  3.0393
            M = 33    1.1244  4.8289 17.9545     1.0000  5.6666 27.0000     1.0013  9.7055 19.0032
            M = 47    1.0215  4.8735 22.0000     1.0000  5.6666 27.0000     1.0000  9.0426 39.0100

  ω = π/4   M = 5     2.2182  2.2186  2.5088     1.0000  1.2732  1.4704     1.0000  1.0000  1.0000
            M = 19    1.6508  2.1982  2.2339     1.0000  1.4022  1.2859     1.0015  1.0016  1.0026
            M = 33    1.0769  2.0962  2.1007     1.0000  1.4103  1.2911     1.0016  1.0000  1.0250
            M = 47    1.0193  1.9888  1.9674     1.0000  1.4122  1.2925     1.0000  1.0000  1.0564

  Column                 a       b       c          d       e       f          g       h       i

A MULTIGRID METHOD FOR THE TRANSONIC FULL POTENTIAL EQUATION
DISCRETIZED WITH FINITE ELEMENTS ON AN ARBITRARY BODY FITTED MESH

Herman DECONINCK    Charles HIRSCH

Vrije Universiteit Brussel, Department of Fluid Mechanics


ABSTRACT 


A multigrid method for the acceleration of transonic potential flow cal- 
culations based on a Galerkin Finite Element approach is described. In order 
to allow the use of arbitrary body fitted meshes it is necessary to introduce
non uniform interpolation and residual weighting. Emphasis is put on the 
construction of these operators consistent with the Finite Element approxima- 
tion, while standard successive line overrelaxation is used as smoothing 
step. Substantial convergence acceleration is obtained and results are pre- 
sented for different transonic flow configurations including shocks. 


INTRODUCTION 


The multigrid method was originally introduced for the solution of the
system of equations obtained from the finite difference (F.D.) discretization
of elliptic partial differential equations by Fedorenko (ref. 1), extended
by Bakhvalov (ref. 2) and further developed by Brandt (ref. 16). It is
based on the idea that corrections for the solution on a fine grid can be
effectively approximated on a coarse grid with help of the common underlying
differential equation.

Finite Element (F.E.) applications were soon recognized and at the pre- 
sent time the mathematical foundations are even better established than in 
the F.D. case although practical implementations are rare. Convergence 
proofs under fairly general conditions for elliptic boundary value problems 
were obtained by Nicolaides (refs. 3,4), Hackbusch (ref. 5) and others. One
of the basic conclusions of these investigations is that the convergence of
the multigrid methods is independent of the step size and that the amount
of computational work for solving the discrete system of n unknowns is
proportional to n. Practical aspects of the F.E. implementation on model
problems are given in Brandt (ref. 6) and Nicolaides (ref. 7), who describes
extensive numerical results obtained for a Poisson equation and another
elliptic equation with variable coefficients and mixed boundary conditions, both
on a uniformly discretized rectangular domain. These results confirm the 
convergence rates obtained with Finite Differences. 

In transonic flow computations, the first multigrid solutions were
proposed by South and Brandt (ref. 8) with the transonic small perturbation
equation and successive line relaxation (SLOR) as smoothing operator. Problems
were encountered in the treatment of the boundary conditions and in
calculations on nonuniform and curvilinear grids, probably due to a lack of
smoothing on a fine grid before passing to a coarser one or due to an
unsatisfactory residual weighting. Jameson (ref. 9) solved the transonic full
potential equation on an arbitrary mesh and obtained very satisfying results
with a generalized ADI as smoothing step. As in most other applications, these
multigrid methods are implemented on a rectangular (or circular) uniform
computational mesh obtained from a mapping of the original physical curvilinear
mesh, allowing uniform interpolations. This approach is appropriate in
cases where the physical problem and boundary conditions are transformed by
a global coordinate transformation as in most finite difference methods.
However, the classical F.E. approach handles the problem in the physical plane
and uses only a local mapping of each curvilinear element to a reference
parent element to facilitate the volume integrations needed in the computation.


Simple uniform interpolation is only obtained if the fine mesh elements are
uniform subdivisions of a coarse grid element. This would pose a severe
limit on the finest mesh that can be achieved, since only the mesh points of
the coarsest mesh could be chosen in an arbitrary way. Therefore non uniform
interpolation and residual weighting are introduced in this paper,
preserving the same flexibility with respect to the geometry as the usual
F.E. methods. An advantage of the F.E. treatment is that the method leads to
natural choices for the interpolation and weighting, even on the boundaries
of the domain.

Indeed, a simple but artificial residual injection following the lines of
F.D. methods has been tried with poor results, confirming the observations
of Nicolaides (ref. 7) on a simple rectangular domain. It turns out that the
amount of additional work due to the non uniformity is limited, since the
same numerical coefficients are needed for the coarse to fine interpolations
as for the fine to coarse weighting.

In the present investigation successive line relaxation with downstream
sweep direction is used as smoothing component. Alternatively this smoothing
operator can be replaced by the F.E. ADI method developed in the past
(ref. 10).


Numerical experiments on channel, single airfoil and cascade geometries
indicate a substantial convergence acceleration compared to the grid
refinement technique, which consists in the application of SLOR to
successively finer grids with the previous coarse grid solution as initial
approximation.


EQUATION AND F.E. APPROXIMATION WITH ISOPARAMETRIC ELEMENTS 


A brief account of the F.E. treatment is given here. More details can be
found in previous publications and the references contained therein
(refs. 13, 14, 15).

The potential equation in conservative form is given by

(1)    $\frac{\partial}{\partial x}(\rho \phi_x) + \frac{\partial}{\partial y}(\rho \phi_y) = 0$

where x and y are the Cartesian coordinates in the physical plane and
$\phi_x$, $\phi_y$ the velocity components.


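The density $\rho$ is obtained from the isentropic relation

(2)    $\rho = \rho_0 \left[ 1 - \frac{\phi_x^2 + \phi_y^2}{2 c_p T_0} \right]^{1/(\gamma - 1)}$

where $\rho_0$ and $T_0$ are stagnation density and temperature, $c_p$ the
specific heat at constant pressure and $\gamma$ the ratio of specific heats.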

In the transonic flow regime equation (1) is mixed elliptic-hyperbolic and
allows different weak solutions for a given set of boundary conditions. If
proper viscosity terms are added to the equation a unique solution is again
guaranteed, which is equal to the physical solution except for a small
region around shocks (ref. 11).

The artificial density form of the artificial viscosity terms, due to Hafez,
Murman and South (ref. 12), is particularly well suited for F.E.
applications and works satisfactorily for flows with Mach numbers up to 1.5
(refs. 14, 15). It is obtained by giving an upwind bias to the density,
which is replaced by

(3)    $\tilde{\rho} = \rho - \mu \, \rho_s \, \Delta s$

where $\rho_s$ is the upwind derivative of $\rho$ along the streamwise
direction s, $\Delta s$ the mesh spacing and $\mu$ a switching function with
cut-off Mach number $M_c$ which controls the amount of artificial viscosity

(3b)   $\mu = \max\left(0,\; 1 - \frac{M_c^2}{M^2}\right)$
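As an illustration only, the switching function (3b) and the upwinded
density (3) can be sketched in Python for a single row of points along a
streamline. The function names, the backward difference used for the upwind
derivative, the assumption that the flow runs in the direction of increasing
index and the default cut-off value are choices made for this sketch; they
are not taken from the code used in the paper.

```python
import numpy as np


def switching_function(mach, mach_cutoff):
    """Eq. (3b): mu = max(0, 1 - Mc^2 / M^2); zero wherever M <= Mc."""
    mach = np.maximum(np.asarray(mach, dtype=float), 1e-12)  # guard against M = 0
    return np.maximum(0.0, 1.0 - (mach_cutoff / mach) ** 2)


def artificial_density(rho, mach, ds, mach_cutoff=0.95):
    """Eq. (3) along a 1-D streamline with uniform spacing ds.

    Assumption of this sketch: the flow goes towards increasing index, so
    the upwind derivative is a first-order backward difference.
    """
    rho = np.asarray(rho, dtype=float)
    rho_s = np.zeros_like(rho)
    rho_s[1:] = (rho[1:] - rho[:-1]) / ds
    return rho - switching_function(mach, mach_cutoff) * rho_s * ds
```

In subsonic regions the switch vanishes and the density is left unchanged,
so the scheme stays centred there; the upwind bias only acts where the local
Mach number exceeds the cut-off value.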


A Finite Element weighted residual approach is based on the weak formulation
of (1) given by

(4)    $R(\phi) = \int_S \rho \, \nabla W \cdot \nabla \phi \, dS - \oint_s W \rho \frac{\partial \phi}{\partial n} \, ds = 0$


for any continuous test function W, where S is the physical flow domain with
boundary s. The functional $R(\phi)$ is called the residual. The integral
over the boundary is the expression of the Neumann boundary conditions
(B.C.) which are part of the problem specification. Three types of geometry
are considered, each giving different specific Neumann B.C.: channel
geometry, single airfoil and cascade geometry. Channel walls and blade or
profile boundaries require the no flux condition

       $\frac{\partial \phi}{\partial n} = 0$

Points belonging to periodic boundaries in cascade geometries are treated as
interior points by letting corresponding periodic points coincide (ref. 14).
At inlet and outlet boundaries either the solution is given (Dirichlet
condition) or the mass flow rate $\rho \, \partial\phi/\partial n$ is
specified, directly or in an iterative way by applying a Kutta-Joukowski
condition at the trailing edge, while the far field condition for the single
airfoil geometry is also introduced by forcing the known mass flow rate
through the far field boundary.


A F.E. approximation of a function $\phi(x,y)$ is obtained by defining a
finite dimensional space $S^h$ with basis functions $N^h_{ij}(x,y)$ attached
to a set of meshpoints (i,j) spread over the flow domain S:

(5)    $\phi^h(x,y) = \sum_{i,j} \phi^h_{ij} \, N^h_{ij}(x,y)$

where $\phi^h_{ij}$ are the meshpoint values of $\phi^h$ and h the typical
mesh size characteristic of the space $S^h$. It follows from (5) that

(6)    $N^h_{ij}(x_{kl}, y_{kl}) = \delta^{kl}_{ij} = \begin{cases} 1 & \text{for } (i,j) = (k,l) \\ 0 & \text{otherwise} \end{cases}$


A discrete Galerkin approximation for the weak form (4) is found by taking a
finite number of test functions W, namely the basis functions of space
$S^h$, giving the following non linear system of equations for the meshpoint
values

(7)    $R^h_{ij} = \sum_{k,l} K^{kl}_{ij}(\phi^h) \, \phi^h_{kl} - f^h_{ij} = 0$

where $K(\phi^h)$ is the stiffness matrix and $f^h$ the contribution of the
Neumann B.C.

(8)    $K^{kl}_{ij} = \int_S \rho(\phi^h) \, \nabla N^h_{kl} \cdot \nabla N^h_{ij} \, dS$
       and
       $f^h_{ij} = \oint_s \rho \frac{\partial \phi}{\partial n} N^h_{ij} \, ds$


It is well known that exactly the same expression for the residual is found
by solving the discrete minimization problem in $S^h$ in cases where a
minimum principle equivalent to the equation can be formulated (as in the
fully elliptic subsonic case).


Expression (7) for the residual is developed in the physical plane and
written in physical coordinates and can be evaluated for any trial function
$\phi^h$ after a choice of the type of element has been made, which
determines the type of basis functions of the space $S^h$. In reference 15
bilinear and biquadratic Lagrange elements have been used, the latter
allowing third order accuracy and parabolic approximation of the boundaries.
With these elements the integrations over an element surface (eq. 8) are
usually carried out with Gauss quadrature after transformation of the
arbitrarily shaped element to a unit square. In the standard F.E. treatment
this transformation is the locally defined isoparametric mapping:

(9.a)  $x^h(\xi^h, \eta^h) = \sum_{i,j} x^h_{ij} \, N^h_{ij}(\xi^h, \eta^h)$

(9.b)  $y^h(\xi^h, \eta^h) = \sum_{i,j} y^h_{ij} \, N^h_{ij}(\xi^h, \eta^h)$

which is completely determined by the mapping of the meshpoints of the space
$S^h$, so that arbitrarily located meshpoints of a finer grid are in general
not mapped uniformly in the $(\xi^h, \eta^h)$ plane.
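The isoparametric mapping (9) can be sketched for a single bilinear element
as follows. This is a minimal illustration, assuming the parent element is
the square with corners at $(\xi,\eta) = (\pm 1, \pm 1)$ and a
counter-clockwise node ordering; the names CORNERS, shape_functions and
map_to_physical are hypothetical.

```python
import numpy as np

# Corners (xi, eta) of the parent square, counter-clockwise (assumed ordering).
CORNERS = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])


def shape_functions(xi, eta):
    """Bilinear basis functions N_k = (1 + xi_k xi)(1 + eta_k eta) / 4."""
    return 0.25 * (1.0 + CORNERS[:, 0] * xi) * (1.0 + CORNERS[:, 1] * eta)


def map_to_physical(nodes_xy, xi, eta):
    """Eqs. (9.a)-(9.b): (x, y) = sum_k (x_k, y_k) N_k(xi, eta)."""
    return shape_functions(xi, eta) @ np.asarray(nodes_xy, dtype=float)
```

Each basis function equals one at its own corner and zero at the others, as
required by eq. (6), so the element corners are mapped onto the prescribed
physical meshpoints and the centre of the parent square is mapped onto the
average of the four corner coordinates.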

The discrete non linear system (eq. 7) has been solved with the usual
iterative methods such as successive line overrelaxation (SLOR) and
approximate factorization (ADI), for which a F.E. version was developed
(ref. 10). The simple SLOR method is reliable but extremely slow due to the
fact that it effectively eliminates only the errors with wavelength
comparable to the meshwidth h. Substantial convergence acceleration was
achieved by solving the series of N+1 problems

(10)   $R^{2^n h}_{ij} = 0, \qquad n = N, N-1, \ldots, 1, 0$

defined in the spaces $S^{2^n h}$, where the errors of wavelength $2^n h$
are eliminated effectively and the computational effort is reduced.

In this grid refinement technique the influence of the coarse meshes is felt
only through the initial approximation for the next finer mesh, while in the
full multigrid approach described subsequently the coarse grid equations are
modified in order to represent meaningful approximations of the fine grid
corrections.


MULTIGRID ALGORITHM 


The multigrid approach is based on a different treatment of low and high
frequency errors in the approximate solution: the high frequency error
components can only be resolved on a fine grid and are fortunately
eliminated efficiently by existing relaxation techniques. Low frequency
components on the other hand are nearly unaffected by relaxation, but they
are scaled with the dimensions of the physical domain and hence can be
eliminated on a coarser grid, where the computational effort is lower and
the propagation of corrections through the domain much more rapid.
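The different behaviour of the two error families under relaxation can be
checked on a small model problem. The sketch below is an assumption-laden
illustration, not the SLOR scheme of the paper: it applies point
Gauss-Seidel sweeps to the one-dimensional equation $-u'' = 0$ with zero
boundary values, so that the iterate is itself the error, and compares how
much a smooth and an oscillatory sine mode are reduced. The function names
and the chosen modes are hypothetical.

```python
import numpy as np


def gauss_seidel_sweep(u):
    """One lexicographic Gauss-Seidel sweep for -u'' = 0, u = 0 at both ends."""
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u


def damping_after_sweeps(mode, n_intervals=64, sweeps=5):
    """Ratio of the max-norm of a sine error mode after and before relaxation."""
    x = np.linspace(0.0, 1.0, n_intervals + 1)
    error = np.sin(mode * np.pi * x)
    start = np.max(np.abs(error))
    for _ in range(sweeps):
        gauss_seidel_sweep(error)
    return np.max(np.abs(error)) / start


smooth_ratio = damping_after_sweeps(mode=1)        # wavelength ~ domain size
oscillatory_ratio = damping_after_sweeps(mode=32)  # wavelength ~ 4 h
```

After a few sweeps the oscillatory mode has almost disappeared while the
smooth mode is practically unchanged, which is the observation that
motivates moving the remaining error to a coarser grid.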

Consider the system of non linear equations (eq. 7) constructed on the
finest mesh with characteristic spacing h:

(11)   $R^h(\phi^h) = 0$

which may be written in correction form with respect to a known approximate
solution $\phi^h_n$:

(12)   $\bar{K}^h(\delta\phi^h) \equiv K^h(\phi^h_n + \delta\phi^h) - K^h(\phi^h_n) = -R^h(\phi^h_n)$

where the unknowns are now the corrections $\delta\phi^h$ given by:

(13)   $\phi^h = \phi^h_n + \delta\phi^h$

Supposing that the high frequency errors have been eliminated effectively by
means of a smoothing operation, such as SLOR or ADI, the correction
$\delta\phi^h$ and residual $R^h(\phi^h_n)$ may be considered as smoothly
varying quantities for which an approximation on a coarser grid makes sense.
This mesh with typical spacing 2h is obtained by dropping the odd numbered
coordinate lines of the mesh, and an updated approximation $\phi^h_{n+1}$
can be calculated according to (eq. 13) by interpolating the coarse grid
approximation $\delta\phi^{2h}$ for $\delta\phi^h$ back to the original mesh

(14)   $\phi^h_{n+1} = \phi^h_n + I^h_{2h} \, \delta\phi^{2h}$


where $I^h_{2h}$ is the coarse to fine grid function interpolation operator
called "prolongation". The coarse grid approximation $\delta\phi^{2h}$ for
the fine grid correction is the solution of the following equation on the
coarse mesh:

(15)   $\bar{K}^{2h}(\delta\phi^{2h}) = -(I^R)^{2h}_h \, R^h(\phi^h_n)$

The fine to coarse residual restriction operator $(I^R)^{2h}_h$ constructs a
meaningful approximation of the coarse grid residual $R^{2h}$ based on the
smoothly varying fine grid residuals.


By defining a coarse grid solution $\phi^{2h}$ as the approximation of
$\phi^h$ on the coarse grid:

(16)   $\phi^{2h} = I^{2h}_h \, \phi^h_n + \delta\phi^{2h}$

where $I^{2h}_h$ is the function restriction, eq. 15 takes again the usual
form of eq. 11:

(17)   $K^{2h}(\phi^{2h}) = f^{2h}$

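where the right hand side is a known function of the fine grid approximate
solution:

(18)   $f^{2h} = -(I^R)^{2h}_h \, R^h(\phi^h_n) + K^{2h}(I^{2h}_h \, \phi^h_n)$

and $\delta\phi^{2h}$ can be eliminated from the updating formula (14) by
means of (16):

(19)   $\phi^h_{n+1} = \phi^h_n + I^h_{2h} \left( \phi^{2h} - I^{2h}_h \, \phi^h_n \right)$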


The solution $\phi^{2h}$ in turn can be approximated on the mesh $S^{4h}$
when it is sufficiently smooth, i.e. the whole procedure can be applied in a
recursive way to eq. (17). This non linear algorithm (F.A.S. scheme) is due
to Brandt (ref. 16), who describes an adaptive strategy for the transition
to a coarser or finer grid depending on the convergence level and speed on a
particular grid. A simpler fixed strategy has been used in the present work
(ref. 7): starting on the finest mesh with spacing h, one line
overrelaxation sweep is performed, followed by the transition to the next
coarser grid by means of eqs. 17 and 18, until the coarsest grid is reached.
On the coarsest grid some additional relaxation sweeps are performed and the
solution of the next finer grid is updated by means of eq. 19, followed by
one relaxation step, until the second finest grid is reached. The cycle
terminates with the updating of the finest grid approximate solution with
the help of eq. 19.

As distinct from F.D. approaches the interpolation operators $I^h_{2h}$,
$I^{2h}_h$ and $(I^R)^{2h}_h$ are not arbitrary but based on the F.E.
interpolation spaces $S^h$ and $S^{2h}$. They are considered in some more
detail in the following sections.
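The fixed strategy described above can be sketched on a one-dimensional
model problem. The example below is only an illustration under stated
assumptions: it uses a finite difference discretization of
$-u'' + u^3 = f$ instead of the finite element potential equation, point
Gauss-Seidel with one Newton step per point instead of line relaxation,
injection for the function restriction (eq. 27), linear interpolation for
the prolongation and the weights (1/4, 1/2, 1/4) for the residual. Because
the model operator is divided by $h^2$, these weights sum to one, whereas
the integral residuals of the paper lead to weights with a larger sum. All
function names are hypothetical.

```python
import numpy as np


def operator(u, h):
    """Model operator K(u) = -u'' + u**3; boundary entries are left at zero."""
    k = np.zeros_like(u)
    k[1:-1] = (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2 + u[1:-1] ** 3
    return k


def relax(u, f, h):
    """One nonlinear Gauss-Seidel sweep (one Newton step per point), in place."""
    for i in range(1, len(u) - 1):
        k_i = (2.0 * u[i] - u[i - 1] - u[i + 1]) / h**2 + u[i] ** 3
        u[i] += (f[i] - k_i) / (2.0 / h**2 + 3.0 * u[i] ** 2)


def restrict_residual(r):
    """Weighted fine to coarse transfer of a residual, interior points only."""
    rc = np.zeros((len(r) - 1) // 2 + 1)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    return rc


def prolong(ec):
    """Linear interpolation of a coarse grid function to the fine grid."""
    e = np.zeros(2 * (len(ec) - 1) + 1)
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return e


def fas_cycle(u, f, h, coarse_sweeps=10, finest=True):
    """One cycle of the fixed strategy; u is updated in place and returned."""
    if len(u) <= 3:                        # coarsest grid: one interior point
        for _ in range(coarse_sweeps):
            relax(u, f, h)
        return u
    relax(u, f, h)                         # one sweep before coarsening
    u_inj = u[::2].copy()                  # function restriction, eq. (27)
    # eq. (18); f - K(u) is minus the residual R of the text
    f_c = operator(u_inj, 2.0 * h) + restrict_residual(f - operator(u, h))
    u_c = fas_cycle(u_inj.copy(), f_c, 2.0 * h, coarse_sweeps, finest=False)
    u += prolong(u_c - u_inj)              # update formula, eq. (19)
    if not finest:
        relax(u, f, h)                     # one relaxation step after the update
    return u
```

A possible use is to start from a zero initial guess on a grid of 64
intervals with a right hand side manufactured from $u = \sin \pi x$ and to
call fas_cycle repeatedly; the residual on the finest grid then drops by
several orders of magnitude within a modest number of cycles.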


THE COARSE TO FINE GRID FUNCTION INTERPOLATION : OPERATOR $I^h_{2h}$


The only natural choice for the interpolation of a coarse mesh function
$\phi^{2h}$ to a fine mesh location $(x^h_{ij}, y^h_{ij})$ is to use the
value in the location $(x^h_{ij}, y^h_{ij})$ given by the F.E. approximation
in space $S^{2h}$:

(20)   $[I^h_{2h} \, \phi^{2h}]_{ij} = \phi^{2h}(x^h_{ij}, y^h_{ij}) = \sum_{k,l} I^{ij}_{kl} \, \phi^{2h}_{kl}$

where the matrix $I^{ij}_{kl}$ is given by

(21)   $I^{ij}_{kl} = N^{2h}_{kl}(x^h_{ij}, y^h_{ij})$


On an arbitrary mesh this results in non uniform interpolation coefficients.
For instance, with bilinear elements (figure 1) uniform interpolation is
only obtained if the fine grid meshpoints are situated in the middle of the
coarse grid element sides and in the center, giving only in this case the
simple formula (figure 1a):

(22)   $[I^h_{2h} \, \phi^{2h}]_c = \frac{1}{4} \left( \phi^{2h}_1 + \phi^{2h}_2 + \phi^{2h}_3 + \phi^{2h}_4 \right)$   for the center node

       $[I^h_{2h} \, \phi^{2h}]_{ij} = \frac{1}{2} \left( \phi^{2h}_i + \phi^{2h}_j \right)$   for the midside node i-j

       $[I^h_{2h} \, \phi^{2h}]_i = \phi^{2h}_i$   for the corner nodes (identity)

It follows that simple uniform interpolation is only possible for uniform
refinements of the coarsest mesh, which alone could be chosen arbitrarily.
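For a uniformly refined mesh, the three cases of formula (22) can be written
compactly with array slices. The following sketch assumes that coarse
meshpoint values are stored in a two-dimensional array and that fine
meshpoint (2k, 2l) coincides with coarse meshpoint (k, l); the function name
prolong_uniform is hypothetical.

```python
import numpy as np


def prolong_uniform(coarse):
    """Uniform bilinear coarse to fine interpolation, formula (22)."""
    coarse = np.asarray(coarse, dtype=float)
    ni, nj = coarse.shape
    fine = np.zeros((2 * ni - 1, 2 * nj - 1))
    fine[::2, ::2] = coarse                                    # corner nodes: identity
    fine[1::2, ::2] = 0.5 * (coarse[:-1, :] + coarse[1:, :])   # midside nodes
    fine[::2, 1::2] = 0.5 * (coarse[:, :-1] + coarse[:, 1:])   # midside nodes
    fine[1::2, 1::2] = 0.25 * (coarse[:-1, :-1] + coarse[1:, :-1]
                               + coarse[:-1, 1:] + coarse[1:, 1:])  # center nodes
    return fine
```

Applied to a single coarse element with corner values 0, 2, 4 and 10, the
midside values are the averages of the two adjacent corners and the center
value is the average of all four.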


In the general case with bilinear elements (figure 1b) four coefficients are
needed for each fine grid meshpoint not coinciding with a coarse grid
meshpoint. The computation of these general coefficients (eq. 21) is not
trivial since $N^{2h}_{kl}(x,y)$ is not explicitly known for an arbitrarily
shaped element, and one has first to invert the isoparametric transformation
(eq. 9) to obtain $\xi^h_{ij}$ and $\eta^h_{ij}$ from:

(23.a) $x^h_{ij} = \sum_{m,n} x^{2h}_{mn} \, N^{2h}_{mn}(\xi^h_{ij}, \eta^h_{ij})$

(23.b) $y^h_{ij} = \sum_{m,n} y^{2h}_{mn} \, N^{2h}_{mn}(\xi^h_{ij}, \eta^h_{ij})$

after which the computation is carried out in the $(\xi, \eta)$ plane, where
the basis functions are simple polynomial expressions:

(24)   $I^{ij}_{kl} = N^{2h}_{kl}(\xi^h_{ij}, \eta^h_{ij})$

With the bilinear elements for instance $N^{2h}$ is of the form

(25)   $\frac{1}{4} (1 \pm \xi)(1 \pm \eta)$

when the four corner points are situated at $(\xi, \eta) = (\pm 1, \pm 1)$.
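The inversion of eqs. (23) is a small non linear problem for each fine
meshpoint. The paper does not state how it is solved; the sketch below
assumes a Newton iteration started at the element center, which is one
common choice for convex, mildly distorted bilinear elements. The names
CORNERS, shape_functions, shape_gradients and interpolation_coefficients,
the node ordering and the tolerance are assumptions of this sketch.

```python
import numpy as np

# Corners (xi, eta) of the parent square, counter-clockwise (assumed ordering).
CORNERS = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])


def shape_functions(xi, eta):
    """Bilinear basis functions of eq. (25), one per coarse element corner."""
    return 0.25 * (1.0 + CORNERS[:, 0] * xi) * (1.0 + CORNERS[:, 1] * eta)


def shape_gradients(xi, eta):
    """Derivatives of the basis functions with respect to xi and eta."""
    d_xi = 0.25 * CORNERS[:, 0] * (1.0 + CORNERS[:, 1] * eta)
    d_eta = 0.25 * CORNERS[:, 1] * (1.0 + CORNERS[:, 0] * xi)
    return d_xi, d_eta


def interpolation_coefficients(coarse_nodes_xy, point_xy, tol=1e-12, max_iter=20):
    """Solve eqs. (23) by Newton iteration, then evaluate eq. (24).

    Returns the four coefficients I_kl weighting the coarse corner values.
    """
    nodes = np.asarray(coarse_nodes_xy, dtype=float)
    target = np.asarray(point_xy, dtype=float)
    xi_eta = np.zeros(2)                      # start at the element center
    for _ in range(max_iter):
        mismatch = shape_functions(*xi_eta) @ nodes - target
        if np.linalg.norm(mismatch) < tol:
            break
        d_xi, d_eta = shape_gradients(*xi_eta)
        jacobian = np.column_stack((d_xi @ nodes, d_eta @ nodes))  # d(x,y)/d(xi,eta)
        xi_eta = xi_eta - np.linalg.solve(jacobian, mismatch)
    return shape_functions(*xi_eta)
```

For a square coarse element and a fine meshpoint at its center the four
coefficients are all 1/4, which recovers the center node case of formula
(22). In general the coefficients sum to one and reproduce the coordinates
of the fine meshpoint from the coarse corner coordinates.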



FINE TO COARSE GRID FUNCTION INTERPOLATION (RESTRICTION) : OPERATOR $I^{2h}_h$

The value of $\phi^h$ in the coarse mesh location calculated with the F.E.
approximation in $S^h$ leads to the identity since the coarse gridpoints
belong also to the fine grid,

(26)   $[I^{2h}_h \, \phi^h]_{ij} = \phi^h(x^{2h}_{ij}, y^{2h}_{ij}) = \sum_{k,l} \phi^h_{kl} \, N^h_{kl}(x^{2h}_{ij}, y^{2h}_{ij})$

which due to (eq. 6) reduces to

(27)   $[I^{2h}_h \, \phi^h]_{ij} = \phi^h_{ij}$

This type of restriction is sometimes called injection.


FINE TO COARSE GRID INTEGRAL INTERPOLATION : OPERATOR $(I^R)^{2h}_h$


As distinct from the F.D. case the residual is an integral quantity which is
scaled differently on different grids. It cannot be represented in the
spaces $S^h$ and $S^{2h}$ and the previous interpolation rules are
inapplicable. In the Galerkin approach the volume integrals are always of
the form

(28)   $[R^{2h}(\phi^{2h})]_{ij} = \int_{S^{2h}_{ij}} g(\phi^{2h}) \, N^{2h}_{ij} \, dS$

For instance the residual in eq. 7 can be rewritten in this form, after
integration by parts, with:

(29)   $g(\phi^{2h}) = -\nabla \cdot \left( \rho(\phi^{2h}) \, \nabla \phi^{2h} \right)$

A consistent representation of $R^{2h}$ by means of fine grid quantities is
found by approximating the coarse mesh functions in the integrand of
(eq. 28) with fine mesh interpolations in the space $S^h$, namely

(30)   $R^{2h}_{ij} \simeq \int_{S^{2h}_{ij}} g(I^{2h}_h \, \phi^h) \; I^{2h}_h N^{2h}_{ij} \, dS$

where $S^{2h}_{ij}$ is the coarse mesh residual integration domain, i.e. the
part of S where $N^{2h}_{ij} \neq 0$ (figure 2).

The interpolation of $\phi^h(x,y)$ to the coarse mesh leads again to the
identity since the interpolation of meshpoint values is the identity by
virtue of eq. 27:

(31)   $I^{2h}_h \, \phi^h(x,y) = \sum_{k,l} N^h_{kl}(x,y) \, [I^{2h}_h \phi^h]_{kl} = \sum_{k,l} N^h_{kl}(x,y) \, \phi^h_{kl} = \phi^h(x,y)$

In the same way the coarse mesh basis function $N^{2h}_{ij}(x,y)$ is
approximated in the space $S^h$:

(32)   $I^{2h}_h \, N^{2h}_{ij}(x,y) = \sum_{k,l} N^h_{kl}(x,y) \; I^{2h}_h N^{2h}_{ij}(x^h_{kl}, y^h_{kl})$

which due to eq. 27 and 21 leads to

(33)   $I^{2h}_h \, N^{2h}_{ij}(x,y) = \sum_{k,l} N^h_{kl}(x,y) \, I^{kl}_{ij}$


The final expression for the coarse mesh residual weighting by means of fine
grid quantities is obtained from eq. 30 by inserting the expressions (31)
and (33):

(34)   $[(I^R)^{2h}_h \, R^h]_{ij} = \sum_{k,l} I^{kl}_{ij} \int_{S^{2h}_{ij}} g(\phi^h) \, N^h_{kl} \, dS$


On a uniformly subdivided coarse mesh (figure 2a) it is clear that this
general expression reduces to

(35)   $[(I^R)^{2h}_h \, R^h]_{ij} = \sum_{k,l} I^{kl}_{ij} \, R^h_{kl}$

where the summation extends only over the 9 inner points in the domain
$S^{2h}_{ij}$, since the coefficients $I^{kl}_{ij}$ are zero on and outside
the boundaries of $S^{2h}_{ij}$.

Comparing eqs. 20 and 35 one concludes that the coarse to fine mesh
interpolation $I^h_{2h}$ is the adjoint of the residual weighting
$(I^R)^{2h}_h$ since they have transposed coefficient matrices:

(36)   $[I^h_{2h} \, \phi^{2h}]_{ij} = \sum_{k,l} I^{ij}_{kl} \, \phi^{2h}_{kl}$

       $[(I^R)^{2h}_h \, R^h]_{ij} = \sum_{k,l} I^{kl}_{ij} \, R^h_{kl}$
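The transposition property (36) can be verified numerically on a small,
uniformly subdivided mesh. The sketch below assembles the coefficient matrix
of the bilinear prolongation for a coarse grid of 3 x 3 meshpoints as a
dense matrix and weights a fine grid residual with its transpose. The dense
storage, the row-major numbering of meshpoints and the function names are
assumptions made for illustration only.

```python
import numpy as np


def prolongation_1d(nc):
    """Linear interpolation matrix from nc coarse points to 2 nc - 1 fine points."""
    p = np.zeros((2 * nc - 1, nc))
    for k in range(nc):
        p[2 * k, k] = 1.0              # coinciding points: identity
    for k in range(nc - 1):
        p[2 * k + 1, k] = 0.5          # midpoints: average of the two neighbours
        p[2 * k + 1, k + 1] = 0.5
    return p


def prolongation_2d(nc):
    """Bilinear interpolation on an nc x nc coarse grid (row-major numbering)."""
    p1 = prolongation_1d(nc)
    return np.kron(p1, p1)


def weight_residual(p, fine_residual):
    """Residual weighting with the transposed prolongation matrix, eq. (36)."""
    nc = int(round(np.sqrt(p.shape[1])))
    return (p.T @ np.asarray(fine_residual, dtype=float).ravel()).reshape(nc, nc)
```

The column of the prolongation matrix belonging to the interior coarse
meshpoint contains the weights 1, 1/2 and 1/4 on the 9 surrounding fine
meshpoints, so a fine grid residual equal to one everywhere is weighted to
the value 4 there. The columns belonging to an edge and a corner meshpoint
have 6 and 4 nonzero entries respectively.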


The following result is obtained for uniform subdivisions (figure 2a), which
corresponds to the uniform interpolation (22):

(37)   $[(I^R)^{2h}_h \, R^h]_{ij} = R^h_{i,j} + \frac{1}{2} \left( R^h_{i,j+1} + R^h_{i,j-1} + R^h_{i-1,j} + R^h_{i+1,j} \right) + \frac{1}{4} \left( R^h_{i+1,j+1} + R^h_{i+1,j-1} + R^h_{i-1,j+1} + R^h_{i-1,j-1} \right)$

Fig. 2a : Uniformly subdivided mesh

Fig. 2b : Arbitrarily subdivided mesh

On an arbitrarily subdivided mesh (figure 2b) the situation is different due
to non overlapping integration domains for the coarse and fine mesh. If one
is willing to apply the formula (35) with summation over the 9 innermost
meshpoints in figure 2b and with the correct non uniform coefficients, two
sources of errors are introduced with respect to the exact formula (34):

. First, the contributions of points (i±1,j±2) and (i±2,j±1) lying inside
the coarse residual integration domain are omitted. For mildly distorted
grids their contributions are negligible since the coefficients
$I^{kl}_{ij}$ are small near and zero on or outside the limits of
$S^{2h}_{ij}$, and also due to the fact that the integral in eq. 34 extends
over only two fine mesh elements compared to 4 for the other points.

. Secondly, the fine mesh integration domains for points (i±1,j±1), (i,j±1)
and (i±1,j) are not always completely contained in the coarse mesh domain
$S^{2h}_{ij}$. Again the errors are small since the surface differences are
small and moreover since the integrands in eq. 34 approach zero near the
limits of the fine mesh integration domain.

In conclusion, eq. 35 remains an extremely valuable approximation for eq. 34
in the arbitrary mesh case, of course only when used with the arbitrary mesh
interpolation coefficients $I^{kl}_{ij}$ already known from the non uniform
interpolation.

It remains equally valid on the Neumann boundaries of the physical domain,
where the summation extends over 6 fine meshpoints, and over 4 for boundary
corners.


The same expression derived here was also obtained for orthogonal meshes by
Nicolaides (ref. 7) and Brandt (ref. 6) based on the minimization approach.
Brandt suggests that this "natural" choice is not always better than the
residual injection, which is simply given by:

(38)   $[(I^R)^{2h}_h \, R^h]_{ij} = 4 \, R^h_{ij}$

in the uniform case.


This has not been confirmed by Nicolaides in his numerical experiments on a
square uniform domain. On an arbitrary mesh residual injection could be
constructed with some theoretical support by supposing the function
$g(\phi^h)$ constant over the coarse mesh residual integration domain,
allowing the following approximations:

(39)   $R^{2h}_{ij} \simeq g(\phi^{2h}) \int N^{2h}_{ij} \, dS$

       $R^h_{ij} \simeq g(\phi^h) \int N^h_{ij} \, dS$

and hence

(40)   $[(I^R)^{2h}_h \, R^h]_{ij} = \frac{\int N^{2h}_{ij} \, dS}{\int N^h_{ij} \, dS} \; R^h_{ij}$

giving exactly expression (38) in the uniform case.

Computational experience with eq. 40 was highly unsatisfactory and showed
that it is inapplicable, at least with the simple smoothing procedure used
in this paper.


COMPUTATIONAL RESULTS 


The convergence of the computation is measured by the evolution of the
average residual on the finest grid in terms of the work count, which is
defined in units representing the work needed for one line relaxation sweep
on the finest mesh. For instance, the work count of one complete multigrid
cycle with four grids and for the present strategy is given by

■I 2(-J- + -i^) + ^ = 1 ,91 units 

plus the additional work for the residual weighting and other
interpolations. The convergence rate as used below is defined as the mean
reduction in the average residual per unit of work. An initial approximate
solution on the finest grid for the multigrid iteration is calculated by
applying the grid refinement technique with five relaxation sweeps per grid.

All computations are carried out on a fine mesh with 73 x 25 meshpoints and
successive coarser meshes of 37 x 13, 19 x 7 and 10 x 4 meshpoints.

Three sets of test cases are presented with different geometric boundary
conditions.

The first set is the non lifting NACA 0012 single airfoil configuration, for
which the mesh generation method of ref. 17 was adopted, leaving out,
however, the symmetric lower half of the mesh since only symmetric non
lifting flows can be treated with the present code, which is primarily
intended for cascade flow computations.

With a free stream Mach number of .80 the standard workshop mesh (ref. 17)
was used. In figure 3 the pressure distribution with a shock of moderate
strength is compared with the results obtained by other participants,
showing good agreement. The evolution of the average residual is given in
figure 5, where the influence of the number of grids is apparent. The
convergence rate is improved from .967 for 2 grids to .900 for 4 grids. The
high speed with which the flow pattern is established is illustrated in
figure 4, where the pressure distribution obtained after 1, 2, 4, 7, 10 and
13 multigrid
cycles is shown. The solution is converged after 10 cycles except for a 
small overshoot ahead of the shock which is suppressed after 13 cycles, when 
the average residual is still only reduced by 2 10“^. 
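Since the convergence rate is defined as the mean reduction of the average
residual per unit of work, it can be converted into the work needed for a
prescribed residual reduction. The short computation below is a worked
example only; the target reduction of $10^{-3}$ is an arbitrary assumption
and the function name is hypothetical.

```python
import math


def work_units_for_reduction(rate, reduction):
    """Work units w such that rate**w equals the requested residual reduction."""
    return math.log(reduction) / math.log(rate)


# Rates quoted in the text for the NACA 0012 case at Mach .80.
work_two_grids = work_units_for_reduction(0.967, 1e-3)
work_four_grids = work_units_for_reduction(0.900, 1e-3)
ratio = work_two_grids / work_four_grids
```

With these rates the two grid scheme needs roughly three times as many work
units as the four grid scheme for the same reduction of the residual.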

The residual evolution for the grid refining method is also plotted in
figure 5, beginning at 50 work units, which is the amount of work carried
out on the coarse grids before passing to the final mesh. A very fast
initial reduction of the mean residual is seen, which corresponds to a fast
suppression of the high frequency errors by the consecutive relaxation
sweeps. The remaining low frequency errors are not eliminated and cause the
convergence to slow down after a small number of relaxation steps.

Figure 7 shows the solution for the flow at Mach .85 containing a strong
shock from M=1.4 to M=.7. For this solution the workshop mesh was used with
a far field boundary at 8 chord lengths from the airfoil. The convergence
history is plotted in figure 6, showing a rate of .957 with 4 grids. The
convergence rate with grid refining approaches 1 after 50 SLOR iterations
and in fact no further improvement of the solution could be obtained
although it was not converged, as is seen in figure 7, dotted line. The non
converged solution has a sharp shock ahead of the correct converged shock
position and is very similar to some of the non conservative solutions
presented in ref. 17. As can be seen in figure 8 the correct shock position
with the multigrid scheme is already obtained after 50 work units (21
multigrid cycles) with a
mean residual reduction of only 5 10^^, showing again the extremely fast eli- 
mination of low frequency errors. 

The second set of results is calculated for a channel flow with a circular
bump on the lower wall with the standard workshop mesh (ref. 17), which is a
sheared Cartesian system. It was given as a test case with an isentropic
inlet Mach number of .85. This Mach number corresponds to a choked flow for
the potential solution, as was confirmed by the results of Veuillot and
Viviand, while Jameson did not succeed in obtaining a solution for a Mach
number higher than .835. On the other hand, all other potential solutions,
including our grid refinement solution, were far from choked, namely with a
peak Mach number of about .92 on the upper wall. Again it is clear that this
solution is not converged. Indeed the multigrid solution converges at M=.849
with a choked solution, as is shown in figure 10 for the pressure
distribution and figure 11 for the isomach lines. In this case 300 work
units were performed
with an average residual reduction of 4.2 10""^. The pressure distribution 
(figure 10) is compared with the solution obtained by Veuillot and Viviand
at M=.8500 with their pseudo time dependent fully conservative potential
method (ref. 17). Our solution at M=.85 diverges due to the fact that at
this Mach number the imposed mass flow rate is higher than the choking mass
flow obtained at M=.849. In figure 9 our solution at M=.835 is compared with
the solution of Jameson obtained with his multigrid ADI scheme (MAD) on a
65 x 17 meshpoint grid allowing 4 or 5 different grids. The residual
evolution with our method (figure 12) shows a constant convergence rate of
.963 after an oscillatory behaviour during 100 work units. The rate obtained
by Jameson with MAD for this case was slightly slower, namely .9742
(ref. 17).

The final result is a subcritical compressor cascade flow. The mesh is
generated by solving a system of elliptic partial differential equations for
the curvilinear coordinates (ref. 14). The convergence history is shown in
figure 13 and compared with the grid refining. The rate obtained with four
grids is .874 and illustrates that the periodic and Neumann boundary
conditions have no adverse effect on the convergence speed obtained with our
multigrid scheme, although no special treatment of the boundaries as
suggested by Brandt has been introduced.





CONCLUSION 


A conceptually simple multigrid scheme has been developed, consistent with
the finite element method and applicable to general arbitrarily generated
body fitted grids. To this end non uniform interpolation and residual
weighting operators had to be introduced. A fast and reliable method is
obtained with the simple straightforward line relaxation scheme as smoothing
step, allowing the calculation of realistic transonic flows with about 10 to
20 multigrid cycles (30 to 50 work units).


MULTI-GRID SOLUTION OF THE NAVIER-STOKES EQUATIONS ON NON-UNIFORM GRIDS


Laszlo Fuchs 

The Royal Institute of Technology, Stockholm 


SUMMARY 

The numerical solution of the Navier-Stokes equations in general two dim- 
ensional domains is considered. A proper finite-difference approximation to the 
governing equations, on non-staggered grids, in a transformed plane, is formu- 
lated. Several aspects of a Multi-Grid method for the solution of the finite- 
difference equations are described. The emphasis in this paper is on the effi- 
ciency of some relaxation schemes and the transfer among the grids which are 
non-uniform in the physical plane. 


INTRODUCTION 

Multi-Grid (MG) methods have been applied with success to many boundary 
value problems. Both elliptic equations [1-4] and mixed elliptic-hyperbolic 
equations [5-7] have been solved with high efficiency. However, the MG method 
is not, yet, considered to be a general purpose method because of its relative 
complexity. The method is usually sensitive to basic errors which can be made 
even by a MG-minded user. For this and other reasons it is important to under- 
stand not only the basic principles of the MG solution procedures but also to 
develop, as general as possible, user oriented codes for the solution of spe- 
cific equations. In this work we discuss some aspects which are of importance 
for the development of Navier-Stokes solvers in general two dimensional domains. 

In most applications so far, the MG codes have been written for problems 
in cartesian coordinates. This is natural for testing the basic principles of 
the method. The idea of using uniform cartesian grids has been extended to 
include globally non-uniform grids, by using a sequence of uniform grids such 
that some of these are applied locally [1]. This approach can be used without
too much difficulty if the boundaries of the computational domain can be
approximated easily by rectangular meshes. For general geometries such an app- 
roach might be too complex and less accurate than the method of the transfor- 
mation of the coordinates. 

We distinguish between two cases where mesh refinements are needed in the 
physical plane. That is, when the mesh should be refined in order to resolve 
geometrical details, or when physical details of the solution are to be re- 
solved. The purpose of the transformation of the coordinates is to make the 
treatment of general geometries easier, while the mesh refinements needed to 
resolve the solution are to be treated adaptively as part of the solution
procedure. Mesh refinements can be done if and when they are needed by
introducing fine uniform meshes locally, in the rectangular computational
domain (e.g. in the way that is described in reference 1).

By the transformation of the coordinates, the boundaries become coordinate
lines and this simplifies the application of the boundary conditions. The
main disadvantage of such transformations is that the equations may become
more complex. "New" terms turn up and the coefficients multiplying the
derivatives may vary considerably throughout the computational domain.
Besides the complexity in programming, the transformation of the coordinates
may reduce the computational efficiency of methods which work well on
uniform cartesian grids.
Thus, working with transformation of the coordinates means that additional at- 
tention must be paid to the formulation of the governing equations, the dis- 
cretization of these equations, the choice of the relaxation scheme and the 
transfer among the grids. In this work we discuss some of these aspects with 
regard to the MG method for the solution of the Navier-Stokes equations in a 
plane. The governing equations are stated and discretized "elliptically" . A 
proper choice for transfer among the grids is related to the governing equa- 
tions. By such a transfer the continuity equation and the compatibility con- 
dition can be satisfied to the same accuracy on all grids. Some relaxation 
schemes, including the so called "Convective" Successive Line Relaxation (C- 
SLR) scheme for the momentum equations, are described. These are analysed by 
local mode analysis and are tested for some coordinate transformations and dif- 
ferent values of Reynolds numbers. The C-SLR scheme is superior to standard 
schemes, and it may be used as a general purpose relaxation scheme in many 
cases. Preliminary results for the solution of the Navier-Stokes equations for 
the flow in rectangular and polar cavities are also given. 


THE GOVERNING EQUATIONS 

The steady state Navier-Stokes equations for incompressible viscous
Newtonian fluids, in two dimensional cartesian coordinates, are given by

    $u_{xx} + u_{yy} - p_x - Re \, (u u_x + v u_y) = 0$          (1.a)

    $v_{xx} + v_{yy} - p_y - Re \, (u v_x + v v_y) = 0$          (1.b)

    $u_x + v_y = 0$                                              (1.c)



where Re is the Reynolds number, u and v are the dimensionless velocity comp- 
onents in the x and y directions, respectively, p is the dimensionless pres- 
sure scaled by the Reynolds number. 

Proper boundary conditions are defined if the velocity vector q = (u,v) is
given on the boundaries, provided that the compatibility condition

    $\iint_\Omega (u_x + v_y) \, dx \, dy = \oint_{\partial\Omega} q \cdot n \, dl = 0$          (1.d)

is satisfied. ($\partial\Omega$ is the boundary of the domain $\Omega$ and n
is the outward pointing unit vector normal to $\partial\Omega$).

Usually no slip boundary conditions are taken on the surface of physical
bodies. Mixed boundary conditions may be used in cases of symmetry.


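When the physical plane is transformed into the computational plane $(\xi,\eta)$ 
the governing equations can be written in the form: 

$$ J[(C_1 u_\xi + C_2 u_\eta)_\xi + (C_2 u_\xi + C_3 u_\eta)_\eta] - (\xi_x p_\xi + \eta_x p_\eta) - Re\,[u(\xi_x u_\xi + \eta_x u_\eta) + v(\xi_y u_\xi + \eta_y u_\eta)] = 0 \qquad (2.a) $$

$$ J[(C_1 v_\xi + C_2 v_\eta)_\xi + (C_2 v_\xi + C_3 v_\eta)_\eta] - (\xi_y p_\xi + \eta_y p_\eta) - Re\,[u(\xi_x v_\xi + \eta_x v_\eta) + v(\xi_y v_\xi + \eta_y v_\eta)] = 0 \qquad (2.b) $$

$$ \xi_x u_\xi + \eta_x u_\eta + \xi_y v_\xi + \eta_y v_\eta = 0 \qquad (2.c) $$

where 

$$ C_1 = (\xi_x^2 + \xi_y^2)/J, \qquad C_2 = (\xi_x\eta_x + \xi_y\eta_y)/J, \qquad C_3 = (\eta_x^2 + \eta_y^2)/J $$

and 

$$ J = \xi_x\eta_y - \xi_y\eta_x $$

u and v are the components of the velocity vector q in the x and y directions, 
respectively. 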


Due to the transformation of the coordinates the boundary conditions are 
simplified in the sense that u and v are given on the boundaries of a unit 
rectangle in the $(\xi,\eta)$ plane. 


Equations (2) are written in a non-conservative form. Conservative forms 
are preferable in many cases. Such a form is essential for the continuity 
equation (2.c). It can be written in conservative form as: 

$$ U_\xi + V_\eta = 0 \qquad (3) $$

where $U$ and $V$ are the velocity components of q in the $\xi$ and $\eta$ directions, 
respectively: 

$$ U = (\xi_x u + \xi_y v)/J $$

and 

$$ V = (\eta_x u + \eta_y v)/J $$

The integral of equation (3) on any closed (simply connected) domain 
gives: 

$$ \iint_{\Omega} (U_\xi + V_\eta)\,d\xi\,d\eta = \oint_{\partial\Omega} (U,V) \cdot n\,dl \qquad (4) $$

The compatibility condition is satisfied if the flux through the boundaries 
vanishes. The numerical integral analog to equation (4) is important for the 
correct transfer of residuals from fine to coarse grids. The momentum 
equations are left in their non-conservative form (eqs. (2.a) and (2.b)). 
FINITE-DIFFERENCE APPROXIMATIONS 


The system of the governing equations (1) is elliptic. It is, therefore, 
natural that this property should be transferred to the finite difference 
approximations as well. Some concepts of ellipticity of finite difference 
equations (as the mesh size goes to zero) are described in reference 2 and the 
references in that paper. For finite mesh size, Brandt and Dinar [2] use the 
concept of the h-ellipticity measure. This concept is directly related to the 
possibility of devising a proper relaxation scheme in the MG sense. 


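For simplicity we consider the discretization of the problem in cartesian 
coordinates. The following notations for the finite difference 
approximations are used (the superscripts F, B and C denote forward, backward 
and central differences): 

$$ (\partial_x^F \phi)_{i,j} = [\phi_{i+1,j} - \phi_{i,j}]/h $$

$$ (\partial_x^B \phi)_{i,j} = [\phi_{i,j} - \phi_{i-1,j}]/h $$

$$ \partial_x^C = (\partial_x^F + \partial_x^B)/2 $$

$$ \partial_{xx} = \partial_x^F \partial_x^B, \qquad \partial_{yy} = \partial_y^F \partial_y^B $$

$$ \Delta_h = \partial_{xx} + \partial_{yy} $$

where h is the mesh size in both x and y directions. 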


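First, the Stokes problem (Re = 0) is considered. If second order accurate 
central differences are used (on non-staggered grids) to approximate 
all the terms in equations (1), then one may write the finite difference 
equations as: 

$$ L_h \Phi = R \qquad (5.a) $$

with 

$$ L_h = \begin{pmatrix} \Delta_h & 0 & -\partial_x^C \\ 0 & \Delta_h & -\partial_y^C \\ \partial_x^C & \partial_y^C & 0 \end{pmatrix}, \qquad \Phi = (u,v,p)^T, \qquad R = (0,0,0)^T $$

where $\partial^C$ denotes the central difference and $\Delta_h$ the five point 
approximation to the Laplacian. The symbol of $L_h$ (see Section 3.2 in 
reference 2), $\hat{L}_h(\theta_1,\theta_2)$, is given by 

$$ \hat{L}_h = \begin{pmatrix} \hat{\Delta}_h & 0 & -i\sin\theta_1/h \\ 0 & \hat{\Delta}_h & -i\sin\theta_2/h \\ i\sin\theta_1/h & i\sin\theta_2/h & 0 \end{pmatrix}, \qquad \hat{\Delta}_h = 2(\cos\theta_1 + \cos\theta_2 - 2)/h^2 $$

This operator is not elliptic (it is quasi-elliptic) and a solution of 
equation (5.a) approximates the solution of the differential equations only on 
average. 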

Elliptic operators may be obtained if staggered grids are used together 
with central differences. Such grids have also been used for the time 
dependent case [8]. Brandt and Dinar [2] have used staggered grids for the solution 
of the two dimensional Navier-Stokes equations in cartesian coordinates. In 
reference 9 a version of this method has been compared with other MG solvers 
of the Navier-Stokes equations. An application of the three dimensional 
staggered grid solver is described in reference 10. 


Staggered grid formulation loses its attractiveness when cross terms 
are to be discretized. Such cross terms may appear after the transformation 
of the coordinates. For this reason we use, here, non-staggered grids. An 
elliptic approximation to equations (1) is obtained by the following operator 
in equation (5.a): 

$$ L_h = \begin{pmatrix} \Delta_h & 0 & -\partial_x^F \\ 0 & \Delta_h & -\partial_y^F \\ \partial_x^B & \partial_y^B & 0 \end{pmatrix} \qquad (6) $$

where $\partial^F$ and $\partial^B$ denote forward and backward differences. 
Thus $\det L_h = \Delta_h(\partial_{xx} + \partial_{yy}) = \Delta_h^2$, with the symbol 
proportional to $(\cos\theta_1 + \cos\theta_2 - 2)^2$. This symbol vanishes only for 
$\theta_1 = \theta_2 = 0$ (mod $2\pi$). This means that the 
approximation to the differential equation is elliptic. It may be noted that 
$\partial^F$ and $\partial^B$ may be interchanged without having an effect on the 
ellipticity. It is also clear that such an approximation is of first order 
accuracy. However, even if the staggered grid approximation is of second order 
accuracy inside the computational domain, the boundary conditions are applied 
with first order accuracy. The total accuracy of both approaches (staggered and 
non-staggered) is of first order for the velocity components [8]. Moreover, 
when the nonlinear (convective) terms are included (Re > 0), central differences 
(for these terms) may be used only if the cell Reynolds number, 
$Re_h = \max(|Re\,h\,u|, |Re\,h\,v|)$, is less than unity. For higher Reynolds numbers 
the symbols of the approximations, both on staggered and non-staggered grids, 
are not elliptic. To preserve the ellipticity for all Reynolds numbers one has 
to use upstream approximations (usually of first order accuracy) to the 
convective terms (see e.g. reference 11). 
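
The contrast between the quasi-elliptic central-difference operator and the elliptic one-sided operator (6) can be checked numerically. The sketch below is an illustration only, not the authors' code: it assumes the Stokes case (Re = 0), the placement of forward differences on the pressure and backward differences in the continuity equation, and a hypothetical helper name `symbol_det`. It evaluates the determinant of the 3x3 symbol at a given Fourier mode.

```python
import numpy as np

def symbol_det(theta1, theta2, h=1.0, one_sided=False):
    """Determinant of the symbol of the discrete Stokes operator (Re = 0)."""
    # Symbol of the five-point Laplacian.
    lap = 2.0 * (np.cos(theta1) + np.cos(theta2) - 2.0) / h**2
    if one_sided:
        # Assumed layout of (6): forward differences act on p,
        # backward differences act on u, v in the continuity equation.
        px = (np.exp(1j * theta1) - 1.0) / h
        py = (np.exp(1j * theta2) - 1.0) / h
        cx = (1.0 - np.exp(-1j * theta1)) / h
        cy = (1.0 - np.exp(-1j * theta2)) / h
    else:
        # Central differences everywhere.
        px = cx = 1j * np.sin(theta1) / h
        py = cy = 1j * np.sin(theta2) / h
    L = np.array([[lap, 0.0, -px],
                  [0.0, lap, -py],
                  [cx, cy, 0.0]], dtype=complex)
    return np.linalg.det(L)

# Highest-frequency mode: the central scheme cannot see it,
# the one-sided scheme can.
central = abs(symbol_det(np.pi, np.pi))
one_sided = abs(symbol_det(np.pi, np.pi, one_sided=True))
```

With central differences the determinant vanishes at the mode $(\pi,\pi)$, which is the source of the averaged (oscillatory) solutions mentioned above; with the one-sided pair it equals the square of the Laplacian symbol and vanishes only at the zero mode.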


A result of approximation (6) is that the standard five point 
approximation to the Poisson operator is satisfied if it is applied on the 
pressure found by solving system (5.a) with (6). That is, the pressure, in 
contrast to the velocity components, is computed with second order accuracy. 


The generalization of approximation (6) on non-staggered grids to the 
transformed equations (2) is done in a straightforward manner. 
THE MULTI-GRID CYCLING PROCEDURE 


For MG solution procedures one defines a sequence of M grids with mesh 
spacings $h_1, \ldots, h_M$ such that the finest grid has the spacing $h_M$. Usually 
$h_{k-1}/h_k = 2$ ($1 < k \le M$). The MG cycling procedure for the solution of the 
system of algebraic equations 

$$ L_M \phi_M = f_M $$

on a grid with spacing $h_M$ is as follows: 

i. Relaxation sweeps are carried out (on the problem $L_k \phi_k = f_k$) on 
the grid k until the convergence is too slow, 

then either 

ii. The problem is transferred to a coarser grid, where new relaxation 
sweeps are done. This transfer (for the FAS-mode) is given by: 

$$ f_{k-1} = L_{k-1} I_k^{k-1} \phi_k + I_k^{k-1}(f_k - L_k \phi_k) $$

where $I_k^{k-1}$ is the transfer operator from the fine grid (k) to the 
coarse grid (k-1). 

Or, if convergence to some accuracy has been obtained, then: 

iii. The correction is transferred to a finer grid. These corrections are 
smoothed out by relaxation sweeps (step i.). 

The procedure ends when the prescribed accuracy is attained on the finest 
grid. In the following sections we discuss some relaxation schemes (step i.) 
and proper transfer operators $I_k^{k-1}$ (step ii.). 
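
Steps i to iii can be sketched on a deliberately simple model problem. The code below is a minimal illustration under stated assumptions, not the solver described in this paper: it uses the 1-D Poisson equation $-u'' = f$ with zero boundary values, a fixed V-cycle instead of the adaptive switching criterion of step i, injection for the solution and full weighting for the residual. All function names are hypothetical.

```python
import numpy as np

def apply_L(u, h):
    """Discrete operator -u'' at interior nodes (boundary rows left at zero)."""
    Lu = np.zeros_like(u)
    Lu[1:-1] = (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return Lu

def relax(u, f, h, sweeps):
    """Step i: pointwise Gauss-Seidel sweeps."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u

def restrict(v):
    """Full-weighting transfer of a fine-grid function to the coarse grid."""
    vc = v[::2].copy()
    vc[1:-1] = 0.25 * v[1:-3:2] + 0.5 * v[2:-2:2] + 0.25 * v[3:-1:2]
    return vc

def prolong(vc):
    """Linear interpolation of a coarse-grid function to the fine grid."""
    v = np.zeros(2 * (len(vc) - 1) + 1)
    v[::2] = vc
    v[1::2] = 0.5 * (vc[:-1] + vc[1:])
    return v

def fas_cycle(u, f, h, level):
    if level == 0:                       # coarsest grid: relax to convergence
        return relax(u, f, h, 50)
    u = relax(u, f, h, 2)                # step i
    u_c_old = u[::2].copy()              # injected fine-grid solution
    # Step ii: FAS right-hand side f_{k-1} = L_{k-1} I u_k + I (f_k - L_k u_k)
    f_c = apply_L(u_c_old, 2 * h) + restrict(f - apply_L(u, h))
    u_c = fas_cycle(u_c_old.copy(), f_c, 2 * h, level - 1)
    u += prolong(u_c - u_c_old)          # step iii: transfer the correction
    return relax(u, f, h, 1)             # smooth the interpolated correction

def demo(n=64, cycles=10):
    x = np.linspace(0.0, 1.0, n + 1)
    f = np.pi**2 * np.sin(np.pi * x)     # exact solution is sin(pi x)
    u = np.zeros(n + 1)
    for _ in range(cycles):
        u = fas_cycle(u, f, 1.0 / n, level=4)
    return np.max(np.abs(u - np.sin(np.pi * x)))
```

Calling `demo()` returns the maximum error against the continuous solution after ten cycles, which should then be dominated by the discretization error rather than by the algebraic error.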

THE RELAXATION PROCEDURE 

The purpose of the relaxation steps in any iterative solution process is 
to smooth out the errors. In a MG-procedure this smoothing process may be 
restricted to those Fourier components of the error which can be described 
on a given grid but not on coarser grids (high frequency components). The 
efficiency of the relaxation procedure (provided that no large-amplitude 
low-frequency errors are generated during the transfer among the grids) determines 
the overall efficiency of the MG solution procedure. 

When a system of difference equations approximating a system of 
differential equations is to be solved, the efficient relaxation of each equation 
(variable) does not necessarily result in an efficient scheme for the system. 
This happens if by relaxing one equation, new high frequency error 
components are introduced in the residuals of the other equations. A way of 
(almost) decoupling the relaxation of the finite difference approximations to 
the continuity (1.c) and the momentum equations (1.a) and (1.b) has been 
suggested by Brandt and Dinar [2] for the staggered grid approximation. Here, 
a similar distributive Gauss-Seidel (DGS) relaxation scheme for the 
non-staggered grid approximation is described. 

We consider, first, the linearized Navier-Stokes equations (with frozen 
coefficients) in cartesian coordinates, discretized on a uniform grid with a 
mesh spacing h. The finite difference approximation can be written as: 

$$ Qu - \partial_x^F p = 0 \qquad (7.a) $$

$$ Qv - \partial_y^F p = 0 \qquad (7.b) $$

$$ \partial_x^B u + \partial_y^B v = 0 \qquad (7.c) $$

where $Q = \Delta_h - Re\,(u\,\partial_x^U + v\,\partial_y^U)$, and $\partial_x^U$ and 
$\partial_y^U$ are the upstream approximations to the first derivatives of the 
convective terms. 


Equations (7.a) and (7.b) can be relaxed by solving (pointwise or 
linewise) for u and v, respectively. Efficient relaxation procedures have been 
developed for these equations both in cartesian and in general curvilinear 
coordinates. 


The relaxation of equation (7.c) is more complex since if only u and v 
are changed, new (high frequency) errors are introduced in equations (7.a) 
and (7.b). This in turn means that the relaxation efficiency (of the high 
frequency error components) of the system may be very poor. If, on the other 
hand, for any function $\chi$, the dependent variables are changed (by $\Delta u$, 
$\Delta v$ and $\Delta p$) according to equations (8), then the residuals of the 
momentum equations are not altered (provided that $Q\partial^F = \partial^F Q$): 

$$ \Delta u = \partial_x^F \chi \qquad (8.a) $$

$$ \Delta v = \partial_y^F \chi \qquad (8.b) $$

and 

$$ \Delta p = Q\chi \qquad (8.c) $$


A particular choice of $\chi$, which is convenient, is to take it to be equal 
to $\delta$ at the node point at which equation (7.c) is to be satisfied, and zero 
elsewhere. Such a choice means that the velocity vector is changed at three node 
points while the pressure is changed at five node points simultaneously. 


Inserting equations (8) into (7.c) gives 

$$ \Delta_h \chi = -(\partial_x^B u + \partial_y^B v) \qquad (9) $$

For our particular choice of $\chi$, 

$$ \delta = (\partial_x^B u + \partial_y^B v)/(4/h^2) \qquad (10) $$

If equations (8) are used then the residuals of the momentum equations 
are unchanged at all node points except at those adjacent to the boundaries. 
The reason for this is that at these points equations (8.a) and (8.b) are 
not valid (since u and v are specified on the boundaries, and they cannot be 
changed). The residual near the boundaries is changed by an amount of order 
$\delta/h^3$, which is of the same order as the original residual in the momentum 
equations. 
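
The two properties claimed for the distributive step, unchanged momentum residuals and a vanishing continuity residual at the relaxed node, can be verified directly. The following sketch is an illustration under simplifying assumptions that are not part of the paper: the Stokes case (so that $Q = \Delta_h$), a periodic grid (so that no boundary nodes interfere), and hypothetical function names.

```python
import numpy as np

def fwd(f, axis, h):
    """Forward difference on a periodic grid."""
    return (np.roll(f, -1, axis) - f) / h

def bwd(f, axis, h):
    """Backward difference on a periodic grid."""
    return (f - np.roll(f, 1, axis)) / h

def lap(f, h):
    """Five-point Laplacian written as the product of one-sided differences."""
    return bwd(fwd(f, 0, h), 0, h) + bwd(fwd(f, 1, h), 1, h)

def residuals(u, v, p, h):
    ru = lap(u, h) - fwd(p, 0, h)        # x-momentum, as in (7.a) with Re = 0
    rv = lap(v, h) - fwd(p, 1, h)        # y-momentum, as in (7.b) with Re = 0
    rc = bwd(u, 0, h) + bwd(v, 1, h)     # continuity, as in (7.c)
    return ru, rv, rc

def dgs_step(u, v, p, h, i, j):
    """Distribute the continuity residual at node (i, j) following (8)-(10)."""
    rc = bwd(u, 0, h) + bwd(v, 1, h)
    delta = rc[i, j] / (4.0 / h**2)      # equation (10)
    chi = np.zeros_like(u)
    chi[i, j] = delta
    return u + fwd(chi, 0, h), v + fwd(chi, 1, h), p + lap(chi, h)

def demo():
    rng = np.random.default_rng(0)
    n, h = 8, 1.0 / 8
    u, v, p = (rng.standard_normal((n, n)) for _ in range(3))
    ru0, rv0, _ = residuals(u, v, p, h)
    u1, v1, p1 = dgs_step(u, v, p, h, 3, 4)
    ru1, rv1, rc1 = residuals(u1, v1, p1, h)
    n_vel = np.count_nonzero((u1 != u) | (v1 != v))
    n_pres = np.count_nonzero(p1 != p)
    return np.allclose(ru0, ru1), np.allclose(rv0, rv1), abs(rc1[3, 4]), n_vel, n_pres
```

The counts returned by `demo()` correspond to the statement that the velocity is changed at three node points and the pressure at five.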


If the momentum equations are written in terms of the velocity 
components which are parallel to the transformed coordinates, one gets equations 
of the form 

$$ Q_1 u - \partial_\xi p = 0 $$

$$ Q_2 v - \partial_\eta p = 0 $$

and $Q_1 \ne Q_2$ in general (e.g. cylindrical coordinates). Under such 
circumstances one cannot design a distributive relaxation scheme which will preserve 
the residuals of both momentum equations. In all such cases one has to use 
a common part of $Q_1$ and $Q_2$ such that the change in the residuals in both 
equations will be smooth and of at most the order of the errors which are 
generated near the boundaries. Such DGS relaxation schemes can be evaluated by 
local mode analysis for the system. 

An important aspect which must be considered together with the 
distributive relaxations is the accuracy in satisfying the compatibility condition 
(1.d). Again, we consider first the case of cartesian coordinates. Using the 
one-sided differences, as in (6), the compatibility condition can be written as 

$$ \sum_{i,j} (\partial_x^B u + \partial_y^B v)\,h^2 = \sum_j (u_1 - u_0)\,h + \sum_i (v_3 - v_2)\,h \qquad (11) $$

where the subscripts 0, 1, 2 and 3 denote the values of the dependent variables 
on the sides of the computational rectangle. It is clear that the right hand 
side of equation (11) must vanish if the compatibility condition is to be 
satisfied. However, the question is what accuracy is tolerated in the numerical 
integration (11). Is it enough if the compatibility condition is satisfied to 
the truncation errors, or must the accuracy be that of the round-off errors? 

To answer this question we consider the system of the algebraic equations 
(for $\delta$) which is obtained from equation (9). The sum of the terms on the left 
hand side of these equations vanishes (to round-off). This means that the 
equations of the system are linearly dependent. On the other hand, the sum of 
the right hand sides equals the left hand side of (11). This implies that in 
order to have a solution to the system of difference equations, the sum in the 
compatibility condition (11) must vanish (to round-off). A unique solution (for 
$\delta$) can be obtained if the value of $\delta$ is specified at some point. The 
corrections in the velocity components (8), which are equal to the derivatives 
of the $\delta$ field, are not dependent on this prescribed value. 

If a non-conservative form of the continuity equation is used then the 
compatibility condition can be satisfied only to a certain accuracy (the 
truncation error). This in turn means that the DGS relaxations can converge only 
to a certain level. 

The numerical compatibility condition must be satisfied on all grids, and 
this must be taken into account when the velocity components are transferred 
to coarse grids. 


FINE TO COARSE GRID TRANSFERS 

As discussed above the validity of the compatibility condition (to 
round-off error) is a condition for the existence of a solution. To satisfy the 
compatibility condition on the coarse grids to the same accuracy as on the 
fine one, the transfer from the fine to the coarse grids should be chosen 
carefully. 


The discretized counterpart of equation (4) is 

$$ S_k = \sum_{i,j} (\partial_\xi U + \partial_\eta V)_{i,j}\,h_k^2 \qquad (12) $$

where i,j denotes the node point $(\xi_i,\eta_j)$ and $h_k$ is the mesh size of a grid 
with the label k. Since equation (12) is in conservative form, one gets that 
$S_k$ equals the flux through the boundaries of the domain. That is, 

$$ S_k/h_k = \sum_j \left[\tfrac{1}{J}(\xi_x u + \xi_y v)\right]_0^1 + \sum_i \left[\tfrac{1}{J}(\eta_x u + \eta_y v)\right]_2^3 \qquad (13) $$

To satisfy the compatibility condition, the right hand side of equation 
(13) must vanish on the finest grid (k = M). 


In the FAS-mode the residuals and the dependent variables are transferred 
to the coarse grids. The right hand side of the continuity equation on coarse 
grids should be compatible with the flux integral (the sum in equation (13)). 
If the sum in equation (12) vanishes on the grid k, then it will vanish also 
on the grid k-1 if 


5 


k-1 


z m, 


k-1 


h 


k-1 


(14. a) 


and 


"k-1 


= [ 










(14. b) 


where the sum in equation (14.a) is over the indices of the coarse grid (k-1) 
and the sum in equation (14.b) completes the first sum so that all the node 
points of the fine grid (k) are covered. 


Equation (13) can be written as 

$$ S_k/h_k = \sum_j [U]_0^1 + \sum_i [V]_2^3 \qquad (15) $$

where $U$ and $V$ are the components of the velocity vector in the $\xi$ and $\eta$ 
directions. A natural transfer of $U$ and $V$ from the grid k to the grid k-1 is by 
weighted averages with the Jacobian as weighting function: 

$$ \ldots \qquad (16) $$

with the sums as in equation (14.b). If relations (16) are used then the 
fluxes through any closed region of the domain are the same and are 
independent of the grid which is used for the computation of the fluxes. 


SMOOTHING FACTORS OF RELAXATION SCHEMES 

The efficiency of relaxation schemes can be estimated with good accuracy 
by using the method of local mode analysis [1-3,5]. Fourier components which 
are of interest from the MG point of view have wave lengths which are of the 
same order as the mesh size. Other error components may be considered as slowly 
varying and thus they do not contribute much to the variation in the residual 
in neighbouring points. The interesting Fourier components (of high 
frequencies) have a short range of effect and therefore boundary effects may be 
neglected. By local mode analysis it is meant that one considers how the 
amplitude, A, of a Fourier component $\exp(i\theta_1\xi/h + i\theta_2\eta/h)$ of the error 
is reduced by one relaxation sweep (carried out on the equation with frozen 
coefficients). The amplification factor of the amplitude is denoted by 
$\mu = \mu(\theta_1,\theta_2)$ and the smoothing factor $\bar{\mu}$ is defined as 

$$ \bar{\mu} = \max |\mu(\theta_1,\theta_2)| \qquad (17) $$

where the maximum is taken over the high frequency components (i.e. those 
components which can be described on a given grid but not on coarser ones). 

In the following we derive the amplification factor for some relaxation 
schemes for the momentum and the continuity equations. 
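
Definition (17) lends itself to a direct numerical evaluation: sample the amplification factor on a grid of frequencies and take the maximum over the high-frequency part. The sketch below is an illustration only, restricted to the diffusion part of the problem (Re = 0, operator $C_1\partial_{\xi\xi} + C_2\partial_{\eta\eta}$). The amplification factors for pointwise Gauss-Seidel (SPR) and for line relaxation with the correction-distribution parameter `alpha` (which reduces to plain SLR for `alpha = 0`) were rederived for this special case; the helper names and the sampling resolution are assumptions.

```python
import numpy as np

def smoothing_factor(mu, n=129):
    """Maximum of |mu| over high frequencies, max(|t1|, |t2|) >= pi/2."""
    t = np.linspace(-np.pi, np.pi, n)
    t1, t2 = np.meshgrid(t, t, indexing="ij")
    high = np.maximum(np.abs(t1), np.abs(t2)) >= np.pi / 2 - 1e-12
    return float(np.max(np.abs(mu(t1, t2))[high]))

def spr(c1, c2):
    """Pointwise Gauss-Seidel, sweeping in increasing index order."""
    def mu(t1, t2):
        num = c1 * np.exp(1j * t1) + c2 * np.exp(1j * t2)
        den = 2 * (c1 + c2) - c1 * np.exp(-1j * t1) - c2 * np.exp(-1j * t2)
        return num / den
    return mu

def line_relaxation(c1, c2, alpha=0.0):
    """Line relaxation along xi = const; alpha * correction goes to line i-1."""
    def mu(t1, t2):
        num = c1 * (1 - 2 * alpha + alpha * np.exp(1j * t1)) \
            + 2 * c2 * alpha * (np.cos(t2) - 1)
        den = c1 * (alpha - 2) + 2 * c2 * (np.cos(t2) - 1) + c1 * np.exp(-1j * t1)
        return num / den
    return mu

mu_spr = smoothing_factor(spr(1.0, 1.0))                       # Poisson, pointwise
mu_slr = smoothing_factor(line_relaxation(1.0, 1.0))           # Poisson, linewise
mu_cslr = smoothing_factor(line_relaxation(1.0, 1.0, 0.25))    # with alpha = 0.25
mu_spr_aniso = smoothing_factor(spr(0.05, 1.0))                # C1/C2 = 0.05
```

For these Re = 0 cases the sampled values can be compared with the corresponding entries of Table I (first group of rows), which is a useful consistency check on any reconstruction of the formulas.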


For the relaxation of the momentum equations we consider the following 
linear operator: 

$$ L_m = C_1\partial_{\xi\xi} + C_2\partial_{\eta\eta} - Re\,(C_3\partial_\xi + C_4\partial_\eta) \qquad (18) $$

where $C_1$, $C_2$, $C_3$ and $C_4$ are taken to be constants. For the Fourier component 
$\exp(i\theta_1\xi/h + i\theta_2\eta/h)$, let $r$ be the amplitude of the residual before a 
relaxation pass, $\bar{r}$ its amplitude after the pass, $\tilde{r}$ the amplitude of the 
dynamic residual and A be the amplitude of the correction. 


For a pointwise Gauss-Seidel (Successive Point Relaxation, SPR) 
relaxation scheme, the dynamical residual is given by 

$$ \tilde{r} = r + A[(C_1 - Re\,h\,H(C_3))e^{-i\theta_1} + (C_2 - Re\,h\,H(C_4))e^{-i\theta_2}]/h^2 \qquad (19) $$

where $H(z) = z$ for $z > 0$ and $H(z) = 0$ for $z < 0$. 


The residual after the pass, $\bar{r}$, is 

$$ \bar{r} = A[(C_1 + Re\,h\,H(-C_3))e^{i\theta_1} + (C_2 + Re\,h\,H(-C_4))e^{i\theta_2}]/h^2 $$


For SPR, A is solved from: 

$$ \tilde{r} + A[-2(C_1 + C_2) - Re\,h\,(|C_3| + |C_4|)]/h^2 = 0 $$

which gives that 

$$ \mu(\theta_1,\theta_2) = \frac{[C_1 + Re\,h\,H(-C_3)]e^{i\theta_1} + [C_2 + Re\,h\,H(-C_4)]e^{i\theta_2}}{2(C_1 + C_2) - C_1 e^{-i\theta_1} - C_2 e^{-i\theta_2} + Re\,h\,NT} \qquad (20) $$

where 

$$ NT = |C_3| + |C_4| + H(C_3)e^{-i\theta_1} + H(C_4)e^{-i\theta_2} $$

For cartesian coordinates and small cell Reynolds numbers, the 
smoothing factor equals that of the Poisson equation, i.e. $\bar{\mu} = 0.5$. For large cell 
Reynolds numbers the smoothing factor depends strongly on the sign of $C_3$ and 
$C_4$ (the relative direction of the relaxation sweep and the flow direction). 
If the relaxation direction and the flow direction are aligned then the 
smoothing factor equals 1/3 for a wide range of values of $C_1$ and $C_2$ (see Table I). 
If the relaxation direction is against the flow direction $\bar{\mu}$ is very close to 
unity. In general cases when the flow direction is not known in advance, one 
can make two relaxation sweeps in opposite directions. In such a way the 
average smoothing factor, per sweep, would be less than $(1/3)^{1/2}$. This type of 
double sweep is called symmetrical relaxation. Difficulty arises also for small 
$Re_h$ when $C_1 \gg C_2$ or $C_2 \gg C_1$. In these cases the relaxation efficiency is 
poor, but it can be improved if Successive Line Relaxations (SLR) are used. The 
amplification factor of $\xi$-SLR applied on (18) (solving for each line 
$\xi$ = constant) is: 

$$ \mu(\theta_1,\theta_2) = \frac{[C_1 + Re\,h\,H(-C_3)]e^{i\theta_1}}{2(C_1 + C_2 - C_2\cos\theta_2) - C_1 e^{-i\theta_1} + Re\,h\,NT} \qquad (21) $$

where 

$$ NT = |C_3| + |C_4| + H(C_3)e^{-i\theta_1} - H(-C_4)e^{i\theta_2} + H(C_4)e^{-i\theta_2} $$


The $\xi$-SLR scheme is efficient if $C_3 > 0$ (independent of $C_4$) or for small 
values of $Re_h$ if $C_2 \gg C_1$ (e.g. when the mesh size in the $\eta$ direction is larger 
than that in the $\xi$ direction). If, on the other hand, $C_1 \gg C_2$, one can use 
$\eta$-SLR instead of $\xi$-SLR. In general cases, when the computational field 
contains large variations in $C_1/C_2$, a good smoothing factor is obtained if $\xi$-SLR 
is followed by $\eta$-SLR. This type of relaxation is called Alternating Direction 
SLR (AD-SLR). $\xi$-SLR is efficient if $C_3 > 0$ (independent of $C_4$), but as the cell 
Reynolds number increases and if the flow and the relaxation directions are 
against each other ($C_3 < 0$), then $\bar{\mu}$ is close to unity (see Table I). General 
flow fields can be relaxed by symmetrical SLR sweeps. 


To improve the simple SLR for cases of general geometry, a modified SLR 
has been tested. The basic idea is to introduce a term for the correction 
problem which simulates a high Reynolds number flow in the relaxation 
direction. One way to achieve such a term, implicitly, is to add part of the 
correction to the approximation on the line just upstream of the line which is 
being updated. In some sense the method resembles distributive relaxations even 
though the purpose and the motivations are completely different. Since the 
term which is added to the correction problem has a convective character, the 
scheme is called Convective Successive Line Relaxation (C-SLR). 


In C-SLR, as in SLR, each line (e.g. $\xi$ = constant) is updated 
simultaneously. The correction at each point j (on the line i) is $\delta_j$, and 
$\alpha\delta_j$ is added to the variables at the node point (i-1,j). If $\alpha = 0$, then 
C-SLR is identical to SLR. 


A local mode analysis for C-SLR (solving each line $\xi$ = constant) gives that 
the amplification factor is given by: 

$$ \mu(\theta_1,\theta_2) = \frac{C_1(1 - 2\alpha + \alpha e^{i\theta_1}) + 2C_2\alpha(\cos\theta_2 - 1) - Re\,h\,NT1}{C_1(\alpha - 2) + 2C_2(\cos\theta_2 - 1) + C_1 e^{-i\theta_1} - Re\,h\,NT2} \qquad (22) $$

where 
NT1 = 


NT2 = 



, if 

C3>0} 

IC^ad - 

-i'&o 

e ^ 

), if 

CD 

A 

C_) 

IC^d ■ 

- a + ae^^* ) , if 

+ 

C^<0} 

{C^a(e^* 

^ - 1) 

, if 

C^<0} 

103(1 ■ 

- a) - e if 

C3>01 

ic^d - 

— i'Q’ 2 \ 

, if 

C^>0} 


» if 

+ 

C3<0> 


- 1) 

» if 

C^<0} 

Tables I.a and I.b compare the smoothing factors of SPR, SLR (C-SLR 
with $\alpha = 0$), C-SLR with a near optimal value of $\alpha$ ($\alpha_{no}$), and C-SLR with a 
fixed value of $\alpha$ ($\alpha = 0.25$). In Table I.a the relaxation direction is aligned 
with the flow direction ($C_3 = C_4 = 1$). In Table I.b the results are for the 
case when the relaxation and the flow directions are against each other. 


TABLE I. - SMOOTHING FACTORS FOR OPERATOR (18) 

Table I.a: C3 = 1, C4 = 1 

  Re_h   C1/C2    SPR     SLR          C-SLR              C-SLR
                          (alpha = 0)  (factor, alpha_no) (alpha = 0.25)

  0       0.05    0.909   0.447        0.277, 0.25        0.277
  0       0.10    0.835   0.447        0.277, 0.25        0.277
  0       0.50    0.567   0.477        0.254, 0.30        0.277
  0       1.00    0.500   0.447        0.254, 0.30        0.277
  0       5.00    0.721   0.714        0.333, 0.50        0.565
  0      10.00    0.835   0.833        0.467, 0.60        0.737
  0      50.00    0.962   0.962        0.855, 0.60        0.937

  1       0.05    0.333   0.036        0.036, 0.00        0.340
  1       0.10    0.333   0.068        0.068, 0.00        0.323
  1       0.50    0.333   0.243        0.159, 0.10        0.224
  1       1.00    0.333   0.333        0.154, 0.25        0.154
  1       5.00    0.525   0.620        0.256, 0.40        0.420
  1      10.00    0.670   0.767        0.391, 0.60        0.632
  1      50.00    0.909   0.949        0.785, 0.60        0.907

  10      0.05    0.333   0.004        0.004, 0.00        0.493
  10      0.10    0.333   0.008        0.008, 0.00        0.488
  10      0.50    0.333   0.038        0.038, 0.00        0.453
  10      1.00    0.333   0.072        0.072, 0.00        0.412
  10      5.00    0.333   0.254        0.119, 0.20        0.188
  10     10.00    0.333   0.414        0.160, 0.30        0.162
  10     50.00    0.671   0.796        0.404, 0.60        0.668

  100     0.05    0.333   0.000        0.000, 0.00        0.538
  100     0.10    0.333   0.001        0.001, 0.00        0.538
  100     0.50    0.333   0.004        0.004, 0.00        0.533
  100     1.00    0.333   0.008        0.008, 0.00        0.527
  100     5.00    0.333   0.038        0.038, 0.00        0.481
  100    10.00    0.333   0.073        0.073, 0.00        0.429
  100    50.00    0.333   0.277        0.127, 0.20        0.185
Table I.b: C3 = -1, C4 = ... 

  Re_h   C1/C2    SPR     SLR          C-SLR              C-SLR
                          (alpha = 0)  (factor, alpha_no) (alpha = 0.25)

  1       0.05    0.821   0.954        0.674, 0.50        0.752
  1       0.10    0.807   0.914        0.644, 0.50        0.717
  1       0.50    0.724   0.728        0.494, 0.40        0.550
  1       1.00    0.667   0.632        0.418, 0.40        0.460
  1       5.00    0.718   0.663        0.347, 0.50        0.513
  1      10.00    0.797   0.784        0.447, 0.60        0.672
  1      50.00    0.945   0.944        0.800, 0.60        0.909

  10      0.05    0.953   0.995        0.717, 0.40        0.786
  10      0.10    0.951   0.990        0.714, 0.40        0.782
  10      0.50    0.934   0.954        0.685, 0.40        0.752
  10      1.00    0.914   0.914        0.664, 0.40        0.717
  10      5.00    0.839   0.728        0.530, 0.40        0.558
  10     10.00    0.807   0.638        0.439, 0.40        0.481
  10     50.00    0.847   0.825        0.606, 0.60        0.739

  100     0.05    0.995   1.000        0.721, 0.40        0.790
  100     0.10    0.995   0.999        0.720, 0.40        0.790
  100     0.50    0.995   0.995        0.717, 0.40        0.786
  100     1.00    0.993   0.990        0.714, 0.40        0.782
  100     5.00    0.970   0.954        0.718, 0.50        0.752
  100    10.00    0.951   0.914        0.670, 0.40        0.717
  100    50.00    0.867   0.729        0.534, 0.40        0.562

From Table I it is clear that $\xi$-SLR is most efficient for large values 
of $Re_h$ if the flow and the relaxation directions are aligned. In general 
cases of coordinate transformations C-SLR with $\alpha_{no}$ gives better results than 
SPR or SLR. Even with a constant, non-optimal $\alpha = 0.25$, the results are better 
than with SPR or SLR (especially for moderate and large $Re_h$ and $C_3 = -1$). It 
is also noted that for operator (18) the C-SLR needs only marginally more 
computational effort compared with SLR. For this reason one can expect a real 
improvement in efficiency, in terms of computational time, when C-SLR is used. 


The smoothing efficiency of both pointwise DGS (P-DGS) and line DGS 
(L-DGS) relaxation schemes is considered for an equation of the form: 

$$ (a\,\partial_\xi^B + b\,\partial_\eta^B)u + (c\,\partial_\xi^B + d\,\partial_\eta^B)v = 0 \qquad (23) $$

where a, b, c and d are constants. 


Local mode analysis has been carried out for both cases. The amplification 
factor for the P-DGS scheme is 

$$ \mu(\theta_1,\theta_2) = \frac{(a\beta + c\gamma)e^{i\theta_1} + (b\beta + d\gamma)e^{i\theta_2}}{\sigma - (a\beta + c\gamma)e^{-i\theta_1} - (b\beta + d\gamma)e^{-i\theta_2}} \qquad (24) $$

where 

$$ \sigma = 2(a^2 + b^2 + c^2 + d^2 + ab + cd) $$

and 

$$ \beta = a + b, \qquad \gamma = c + d $$


For the case of rectangular coordinates (with a = d = 1, b = c = 0), one 
gets back the smoothing factor for the Poisson equation. The efficiency of 
the P-DGS decreases somewhat when all the coefficients are of the same order. 
The smoothing factor is then equal to 0.632. The worst case is when either 
a or d is much larger than the other coefficients. In such cases improvement 
is obtained if L-DGS is used. The amplification factor of L-DGS (solving for 
a line $\xi$ = constant) is 

$$ \mu(\theta_1,\theta_2) = \frac{(a\beta + c\gamma)e^{i\theta_1}}{\sigma - 2(b\beta + d\gamma)\cos\theta_2 - (a\beta + c\gamma)e^{-i\theta_1}} \qquad (25) $$


This L-DGS is most efficient when d is dominant, while L-($\eta$)DGS is most 
efficient when a is dominant. Alternating direction L-DGS relaxation sweeps 
can be used in those cases when the coefficients vary largely in the 
computational domain. 


COMPUTATIONAL RESULTS AND DISCUSSION 

At this stage of our work we have tested mainly the efficiency of the 
relaxation schemes for the momentum equations. These studies are of interest 
for some coordinate transformations which may be used for practical problems. 
Results have also been obtained for the Navier-Stokes equation solver, but 
these results are of preliminary character, since the transfer among the grids 
has not been done properly. 

In reference 3 a user oriented MG-program for the solution of the Poisson 
equation in general two dimensional coordinates is presented. The relative 
efficiency of several relaxation schemes, applied to the Poisson equation in 
linearly and exponentially stretched coordinates, has been compared. In that 
work [3] it has been concluded that: 

i. The rate of convergence of different relaxation operators is not 
sensitive to the number of the net points. 

ii. As predicted by local mode analysis, simple $\eta$- or $\xi$-SLR are not 
efficient on grids where max $|C_1/C_2| \gg 1$ and min $|C_1/C_2| \ll 1$. 

iii. The AD-SLR method has a rate of convergence which is not sensitive 
to variations in $C_1/C_2$. 

iv. The C-SLR scheme is superior to usual SLR and AD-SLR schemes except, 
possibly, when the stretching is very large. In such cases C-AD-SLR 
is recommended. 

These tests are valid also for the Stokes (Re = 0) momentum equations. 
Recently, the tests have been extended to cover the momentum equations for 
non-vanishing Reynolds numbers. The following coordinate transformations have 
been considered: 

1. The identity transformation: $\xi = x$, $\eta = y$. 

2. Mesh refinements near the boundaries of a unit square: 

$$ x = a\tanh((\xi - 0.5)/\varepsilon) + 0.5 $$
$$ y = a\tanh((\eta - 0.5)/\varepsilon) + 0.5 $$

where 

$$ a = 1/(2\tanh(0.5/\varepsilon)) $$

3. Exponentially stretched coordinates (as in reference 3): 

$$ x = \xi + e^{\ldots} $$
$$ y = \eta + \ldots $$

4. Polar coordinates: 

$$ x = r\cos\varphi $$
$$ y = r\sin\varphi $$

and $r = \xi + 1$ and $\varphi = \eta$. 

5. Polar (stretched) coordinates, as in 4, but with: 

$$ r = 1.3 + a\tanh((\xi - 0.5)/\varepsilon) $$
$$ \varphi = a\tanh((\eta - 0.5)/\varepsilon) + 0.5 $$
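
The effect of the boundary-refining map of transformation 2 is easy to inspect numerically. The sketch below is an illustration only: it assumes the reading $x = a\tanh((\xi - 0.5)/\varepsilon) + 0.5$ with $a = 1/(2\tanh(0.5/\varepsilon))$ of the formula above, and the function name `stretched` is hypothetical.

```python
import numpy as np

def stretched(xi, eps):
    """Map a uniform coordinate xi in [0, 1] to a physical coordinate in [0, 1]."""
    a = 1.0 / (2.0 * np.tanh(0.5 / eps))   # normalization so that x(0)=0, x(1)=1
    return a * np.tanh((xi - 0.5) / eps) + 0.5

xi = np.linspace(0.0, 1.0, 17)             # uniform computational grid, 1/h = 16
x = stretched(xi, eps=1.0 / 3.0)           # eps = 1/3 as used for Table II
dx = np.diff(x)                            # physical mesh spacing
ratio = dx[len(dx) // 2] / dx[0]           # central spacing / boundary spacing
```

Under this reading the physical spacing is smallest next to the boundaries and largest in the middle of the interval, and decreasing `eps` strengthens the clustering.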


Known functions, of order one, are chosen and they define a forcing term 
in the governing equations in such a way that the exact numerical solution 
equals these functions. Only MG-cycling is used (starting always with a zero 
initial approximation) and the iterations are continued to an accuracy beyond 
the truncation errors. In this way the asymptotic convergence factor $\rho$ is 
obtained. 
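
How an asymptotic convergence factor can be extracted from a run is sketched below. The paper does not state the exact definition that was used, so the following is an assumption: the factor is estimated as the geometric mean of the last few ratios of successive residual norms. The function name and the `tail` parameter are hypothetical.

```python
import numpy as np

def asymptotic_factor(residual_norms, tail=5):
    """Geometric mean of the last `tail` ratios of successive residual norms."""
    r = np.asarray(residual_norms, dtype=float)
    ratios = r[1:] / r[:-1]
    return float(np.exp(np.mean(np.log(ratios[-tail:]))))

# Synthetic residual history that is halved in every cycle.
history = [0.5**k for k in range(12)]
factor = asymptotic_factor(history)
```

Using only the tail of the history discards the initial cycles, in which the reduction is usually faster than the asymptotic rate.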


In the following tables the asymptotic convergence factor, $\rho$, the 
computational time, T, and the absolute error in the solution, E, are given for 
the following cases: 

i. Results for transformation 2 ($\varepsilon = 1/3$) are given in Table II. 

ii. Results for transformation 3 ($\varepsilon = 1$) are given in Table III. 

iii. Results for transformation 4 are given in Table IV. 


TABLE II: TRANSFORMATION 2. 

Each entry gives: convergence factor, T, E. 

  Scheme   Re      1/h = 8             1/h = 16            1/h = 32

  SPR      0       0.81, 0.12, 6E-3    0.86, 0.43, 6E-2    0.88, 1.67, 2E-1
  SPR      1       0.79, 0.16, 5E-3    0.86, 0.45, 6E-2    0.88, 1.71, 2E-1
  SPR      10      0.70, 0.10, 2E-3    0.84, 0.48, 2E-2    0.88, 1.71, 1E-1
  SPR      100     0.25, 0.03, 9E-5    0.61, 0.22, 6E-4    0.80, 1.50, 5E-3

  SLR      0       0.75, 0.13, 8E-3    0.83, 0.56, 5E-2    0.84, 2.20, 2E-1
  SLR      1       0.74, 0.14, 6E-3    0.82, 0.56, 5E-2    0.84, 2.27, 2E-1
  SLR      10      0.63, 0.09, 3E-3    0.80, 0.22, 1E-2    0.84, 2.24, 1E-1
  SLR      100     0.29, 0.04, 3E-4    0.57, 0.24, 3E-4    0.75, 2.04, 3E-3

  C-SLR    0       0.56, 0.07, 6E-4    0.64, 0.37, 8E-4    0.68, 1.92, 1E-3
  C-SLR    1       0.54, 0.06, 3E-3    0.64, 0.40, 1E-3    0.68, 1.92, 1E-3
  C-SLR    10      0.34, 0.03, 5E-4    0.59, 0.29, 1E-3    0.64, 1.57, 9E-4
  C-SLR    100(a)  0.29, 0.03, 3E-4    0.57, 0.26, 3E-4    0.75, 1.75, 3E-3

(a) The convectivity coefficient in C-SLR is taken to be zero. 
TABLE III: TRANSFORMATION 3. 

Each entry gives: convergence factor, T, E. 

  Scheme   Re      1/h = 8             1/h = 16            1/h = 32

  SPR      0       0.56, 0.05, 4E-3    0.57, 0.17, 2E-3    0.60, 0.91, 3E-4
  SPR      1       0.58, 0.07, 1E-3    0.59, 0.23, 2E-3    0.61, 0.89, 2E-3
  SPR      10      0.48, 0.03, 1E-3    0.59, 0.19, 2E-3    0.66, 1.00, 1E-3
  SPR      100     0.26, 0.03, 9E-5    0.26, 0.08, 5E-4    0.50, 0.65, 8E-5

  SLR      0       0.50, 0.05, 2E-3    0.60, 0.33, 5E-4    0.64, 1.56, 2E-3
  SLR      1       0.51, 0.05, 1E-3    0.60, 0.32, 1E-3    0.65, 1.58, 1E-3
  SLR      10      0.36, 0.04, 3E-4    0.52, 0.21, 6E-4    0.60, 1.28, 3E-4
  SLR      100     0.31, 0.03, 2E-4    0.21, 0.09, 5E-4    0.33, 0.63, 5E-5

  C-SLR    0       0.41, 0.04, 1E-3    0.46, 0.21, 5E-4    0.45, 0.90, 8E-4
  C-SLR    1       0.53, 0.04, 1E-3    0.52, 0.25, 6E-4    0.52, 1.33, 4E-4
  C-SLR    10(a)   0.36, 0.03, 3E-4    0.52, 0.21, 6E-4    0.60, 1.36, 3E-4
  C-SLR    100(a)  0.31, 0.04, 2E-4    0.21, 0.11, 5E-4    0.33, 0.68, 5E-5


TABLE IV: TRANSFORMATION 4. 

Each entry gives: convergence factor, T, E. 

  Scheme   Re      1/h = 8             1/h = 16            1/h = 32

  SPR      0       0.62, 0.06, 3E-3    0.64, 0.27, 1E-3    0.67, 1.15, 1E-3
  SPR      1       0.62, 0.06, 3E-3    0.63, 0.23, 5E-3    0.66, 1.17, 1E-3
  SPR      10      0.54, 0.06, 1E-3    0.61, 0.21, 2E-3    0.64, 1.04, 1E-3
  SPR      100     0.24, 0.01, 3E-3    0.45, 0.12, 2E-4    0.56, 0.77, 3E-4

  SLR      0       0.56, 0.06, 2E-3    0.63, 0.34, 6E-4    0.68, 1.67, 1E-3
  SLR      1       0.56, 0.06, 4E-3    0.63, 0.32, 2E-3    0.68, 1.78, 8E-4
  SLR      10      0.54, 0.06, 2E-3    0.64, 0.40, 9E-4    0.66, 1.67, 9E-4
  SLR      100     0.23, 0.02, 1E-4    0.40, 0.17, 1E-4    0.54, 1.08, 1E-4

  C-SLR    0       0.62, 0.06, 1E-3    0.43, 0.22, 1E-3    0.46, 0.93, 2E-3
  C-SLR    1       0.46, 0.04, 1E-3    0.46, 0.22, 2E-3    0.44, 1.14, 5E-4
  C-SLR    10      0.32, 0.02, 3E-4    0.51, 0.25, 3E-4    0.54, 1.21, 7E-4
  C-SLR    100(a)  0.23, 0.03, 1E-4    0.40, 0.19, 1E-4    0.54, 1.12, 1E-4

(a) The convectivity coefficient in C-SLR is taken to be zero. 

It is noted from Table II that the SPR and the SLR schemes do not converge
on the finest grid. Even in these cases the C-SLR results in an acceptable
convergence factor. For the C-SLR only three different values of the
parameter a were used in the actual computations, and hence the C-SLR has
not been optimized. Furthermore, all the results (in Tables II-IV) have been
obtained with the same values of the MG control parameters, which are nearly
optimal for SPR in Cartesian coordinates. Improved results for C-SLR are
obtained when other MG control parameters are used. In general the C-SLR
gives a real computational gain (not only in terms of convergence factor, but
also in terms of computational time), especially in cases similar to
Transformation 2. For simple Cartesian cases, however, SPR results in the
shortest computational times.
The Navier-Stokes solver has been tested for the computation of the flow
in rectangular and polar driven cavities. The code does not yet include the
correct transfer of the dependent variables to coarse grids, and hence one can
expect a reduction in the total efficiency. The problem has been solved on
several grids, and the convergence factor is found to be about the same for
both cavities and the different grids. A comparison of these preliminary
results for the (non-optimal) C-SLR and the SPR schemes is given in Table V.



TABLE V: CONVERGENCE FACTORS, DRIVEN CAVITIES.

Re       C-SLR     SPR

0        0.78      0.76
1        0.74      0.84
10       0.72      0.83


As expected, the C-SLR gives somewhat better results than the SPR scheme.
However, the results are not satisfactory, and an effort is being made to
improve the overall efficiency of the Navier-Stokes solver.
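
To put the factors of Table V in perspective, the following short sketch
converts a convergence factor into the number of cycles needed for one decade
of residual reduction. It assumes that the residual shrinks by the quoted
factor in every cycle (the text does not state the unit in which the factors
are measured); the function name is ours and purely illustrative.

```python
import math

def cycles_per_decade(factor: float) -> float:
    """Cycles needed to reduce the residual by 10x, assuming the residual
    is multiplied by `factor` in every cycle (an idealisation)."""
    if not 0.0 < factor < 1.0:
        raise ValueError("convergence factor must lie in (0, 1)")
    return math.log(0.1) / math.log(factor)

# Convergence factors at Re = 10, copied from Table V.
table_v_re10 = {"C-SLR": 0.72, "SPR": 0.83}
cycles = {name: cycles_per_decade(f) for name, f in table_v_re10.items()}

for name, n in cycles.items():
    print(f"{name}: {n:.1f} cycles per decade of residual reduction")
```

Under this assumption the difference between 0.72 and 0.83 corresponds to
roughly seven versus twelve cycles per decade.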

CONCLUDING REMARKS 

The progress made in developing a general purpose two-dimensional
Navier-Stokes solver is reported here. The computational code, which gives
good results, is not considered to be in its final form, and additional
improvement is expected. Further work is being done on the extension of the
method to problems which include planes of symmetry where mixed boundary
conditions are to be applied.
APPLICATION OF THE MULTI-GRID METHOD TO CALCULATIONS OF
TRANSONIC POTENTIAL FLOW ABOUT WING-FUSELAGE COMBINATIONS*

Arvin Shmilovich and D. A. Caughey
Cornell University 


ABSTRACT 

The Multi-Grid (MG) method has been applied to the calculation of
transonic, potential flowfields about arbitrary, three-dimensional, wing- 
body combinations. Numerical results for iterative convergence rate are in 
good agreement with those predicted by a local mode analysis, and show sub- 
stantial improvement over conventional relaxation algorithms. 


NOMENCLATURE 


a                  = speed of sound
a, b, R            = normalization parameters for the angular coordinate
                     (equation (19))
A, B, C, D, E, H   = coefficients of second derivatives for the potential
                     equation (28) in the transformed cylindrical coordinates
$A_X, A_Y, A_Z$    = recoupling coefficients
$A_\eta, A_r$      = cell aspect ratios in $\eta$ and r directions, relative
                     to the width of cell in $\xi$ direction, respectively
c                  = chord length
$C_p$              = pressure coefficient
d                  = dimension of the problem
$F^k$              = forcing function
$g^{ij}$           = elements of inverse of the metric tensor
G                  = growth factor
$G^k$              = kth level
$\bar{G}, \hat{G}$ = maximum growth factor on finest grid per iteration and
                     per MG cycle, respectively
h                  = mesh spacing
H, h               = Jacobian of the transformation and its determinant,
                     respectively
I                  = interpolation operator
k                  = ratio of specific heats
M                  = Mach number
$M_c$              = "cut off" Mach number
p                  = wave number
P, Q, R            = artificial viscosity fluxes
$\hat{P}, \hat{Q}, \hat{R}$ = terms used for constructing the P, Q, R fluxes,
                     respectively
$\vec{q}$          = velocity vector
$R_F, R_T$         = radial coordinate of fuselage surface and wing tip,
                     respectively
s                  = coordinate tangent to streamline
S                  = width of strip in conformally mapped plane (figure 2b)
T                  = recoupling term
u, v, w            = velocity components in x, y, z coordinates, respectively
U, V, W            = contravariant velocity components in X, Y, Z
                     coordinates, respectively
$W_k$              = work spent on the kth MG level to reduce the residuals
                     to within the truncation error of the kth grid
W                  = total work (estimated) for obtaining a solution to the
                     level of the truncation error
x, y, z            = Cartesian coordinates


* This work has been supported by the Office of Naval Research under
Contract N00014-77-C-0033.
$x, r, \theta$     = cylindrical coordinates
$\bar{x}, \bar{r}, \bar{\theta}$ = normalized cylindrical coordinates
X, Y, Z            = coordinates of the computational domain
$\alpha$           = modes (equation (29))
$\delta$           = central difference operator
$\epsilon$         = recoupling parameter
$\mu$              = artificial viscosity switching function (equation (16))
$\mu$              = averaging operator
$\xi, \eta$        = coordinates in mapped cylindrical plane (equation (20))
$\rho$             = density
p g                = weighting coefficients for the residual transfer
                     (figure 4a)
$\tau$             = truncation error
$\phi$             = velocity potential
$\omega$           = overrelaxation factor

Subscripts

$(\ )_\infty$      = value at upstream infinity
$(\ )_s$           = coordinates of the singular line

I. INTRODUCTION

The MG method has been shown effective in speeding up the convergence
of relaxation solutions of the difference equations arising from discrete
approximations to problems of elliptic type (refs. 1-5). Less attention
has been focussed on non-elliptic problems. The advantages of the method
for problems of mixed type have been demonstrated by South and Brandt
(ref. 6), who treated the two-dimensional transonic small disturbance
equation for the non-lifting flow past a parabolic arc airfoil. Substantial
deterioration in performance of the MG method has been encountered when using
successive line overrelaxation (SLOR) on stretched grids. Jameson (ref. 7)
applied the MG method to the two-dimensional potential equation using an
alternating-direction-implicit (ADI) smoothing algorithm, and obtained good
rates of convergence even on highly stretched grids.

The extension of the MG method to three-dimensional calculations seems 
attractive, since the process of eliminating the persistent low frequency 
components of the error using conventional relaxation techniques is expen- 
sive, and the work required for a relaxation sweep on a coarser grid is only 
1/8 of that on the preceding grid when the grid spacing is doubled in each 
direction. Recent work on an MG code for three-dimensional transonic flow
about axisymmetric inlets has been reported by McCarthy and Reyhner in
reference 8.

An existing three-dimensional transonic potential code designed by 
Caughey and Jameson has been modified to accommodate the MG procedure. The 
code is called FLO 30, and solves a fully conservative difference approxima- 
tion in a boundary-conforming coordinate system. 

Experience gained in two-dimensional numerical calculations, both from 
the programming and predictability aspects, guided us in carrying out the 
work reported herein. An attempt is made to predict the performance of the 
accelerated iterative scheme by means of the local mode analysis. By an 
a priori knowledge of the rate of convergence, a stopping criterion can be 
established. 
In the following sections we describe the finite volume method and the
grid generation procedure (refs. 9-11) to the extent that is needed for
understanding the features associated with the incorporation of the MG
technique into the existing code. The MG procedure is reviewed as well,
emphasizing those aspects with direct implications for the problem under
consideration. Theoretical estimates of the MG performance on rather
complicated grids are discussed. Numerical calculations demonstrating the
beneficial effect of the MG technique in accelerating the original relaxation
scheme are presented, and the validity of the theoretical estimates is
confirmed.


II. ANALYSIS


A. Finite Volume Method


A detailed exposition of the finite volume method devised by Jameson
and Caughey may be found in reference 9.


The continuity equation for steady, inviscid, isentropic flow in
Cartesian coordinates x,y,z reads

$$(\rho\phi_x)_x + (\rho\phi_y)_y + (\rho\phi_z)_z = 0, \tag{1}$$

where $\phi$ is the velocity potential. The density $\rho$ is given by the
isentropic relation

$$\rho = \left[1 + \frac{k-1}{2} M_\infty^2 \left(1 - u^2 - v^2 - w^2\right)\right]^{1/(k-1)}, \tag{2}$$

where k is the ratio of specific heats and $M_\infty$ is the free stream Mach
number. The description of the velocity $\vec{q} = (u,v,w)$ in terms of a
scalar potential,

$$\vec{q} = \nabla\phi, \tag{3}$$

is a consequence of the assumption that the flowfield contains no strong
shocks, so that the flow may be regarded as being irrotational.
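
As a small illustration of the isentropic relation (2), the sketch below
evaluates the density from the velocity components. It assumes the form
$\rho = [1 + \frac{k-1}{2} M_\infty^2 (1 - q^2)]^{1/(k-1)}$ with velocities
normalized by the free stream speed (so that $\rho = 1$ in the free stream);
the function name and the default k = 1.4 are our own choices.

```python
def isentropic_density(u, v, w, mach_inf, k=1.4):
    """Density from the isentropic relation, assuming velocities are scaled
    by the free stream speed so that rho = 1 when u*u + v*v + w*w = 1."""
    q2 = u * u + v * v + w * w
    base = 1.0 + 0.5 * (k - 1.0) * mach_inf**2 * (1.0 - q2)
    return base ** (1.0 / (k - 1.0))

rho_free_stream = isentropic_density(1.0, 0.0, 0.0, mach_inf=0.8)
rho_stagnation = isentropic_density(0.0, 0.0, 0.0, mach_inf=0.8)
rho_accelerated = isentropic_density(1.2, 0.0, 0.0, mach_inf=0.8)
print(rho_free_stream, rho_stagnation, rho_accelerated)
```

The density rises above its free stream value where the flow decelerates and
falls below it where the flow accelerates, as expected for isentropic flow.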

The finite volume method does not require knowledge of the global nature
of the transformation which generates the grid network, but uses only
local properties of the transformation. We introduce an arbitrary
transformation to a new coordinate system X,Y,Z and define the Jacobian H

$$H = \begin{pmatrix} x_X & x_Y & x_Z \\ y_X & y_Y & y_Z \\ z_X & z_Y & z_Z \end{pmatrix}, \tag{4}$$

with its determinant

h = det(H).


The contravariant components U,V,W of the velocity can be expressed as

$$\begin{pmatrix} U \\ V \\ W \end{pmatrix} = (H^T H)^{-1} \begin{pmatrix} \phi_X \\ \phi_Y \\ \phi_Z \end{pmatrix}, \tag{5}$$

where $H^T H$ represents the metric tensor. Under the transformation, the
continuity equation becomes

$$(\rho hU)_X + (\rho hV)_Y + (\rho hW)_Z = 0. \tag{6}$$


It is convenient to utilize a transformation that locally converts an
arbitrary cell in the physical space into a cube in the computational domain,
such that its center is located at the origin and its vertices are at a
distance unity apart.

The simplest such mapping assumes a trilinear variation of the coordinates
and the potential within each cell. Thus, the shape function for the x
coordinate, say, is

$$x = 8 \sum_{i=1}^{8} x_i \left(\tfrac{1}{4} + X_i X\right)\left(\tfrac{1}{4} + Y_i Y\right)\left(\tfrac{1}{4} + Z_i Z\right), \tag{7}$$

i denoting the ith vertex of the cell, whose computational coordinates are
$(X_i, Y_i, Z_i) = (\pm 1/2, \pm 1/2, \pm 1/2)$. It can be verified that such
formulas yield expressions for the derivatives such as

$$x_X = \tfrac{1}{4}\left(x_2 - x_1 + x_4 - x_3 + x_6 - x_5 + x_8 - x_7\right) \tag{8}$$


when evaluated at the centers of the mesh cells. Thus, the Jacobian and
the contravariant velocity vector may be readily calculated. For the sake
of simplifying the notation, let us introduce the averaging and differencing
operators

$$\mu_X f_{i,j,k} = \tfrac{1}{2}\left(f_{i+1/2,j,k} + f_{i-1/2,j,k}\right), \qquad \delta_X f_{i,j,k} = f_{i+1/2,j,k} - f_{i-1/2,j,k}. \tag{9}$$

With this notation, the approximations for the derivatives can be written as

$$x_X = \mu_{YZ}\,\delta_X x, \tag{10}$$

with similar expressions for derivatives of y, z, $\phi$ and of derivatives in
the other directions. Taking a second difference of the contravariant fluxes,
the continuity equation is cast in the form

$$\mu_{YZ}\delta_X(\rho hU) + \mu_{ZX}\delta_Y(\rho hV) + \mu_{XY}\delta_Z(\rho hW) = 0. \tag{11}$$

This formula can be interpreted as conserving mass fluxes in an auxiliary
cell which overlaps eight primary cells, having vertices at the centers
of the primary cells.
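
The relation between the trilinear shape function and the averaging and
differencing operators can be checked numerically. The sketch below is an
illustration under stated assumptions, not the FLO 30 implementation: it
takes the shape function in the form $8\sum_i x_i(\frac14 + X_iX)(\frac14 +
Y_iY)(\frac14 + Z_iZ)$ with vertices at $\pm 1/2$, stores the eight vertex
values in a 2x2x2 array, and uses helper names (`mu`, `delta`,
`trilinear_x`) that are ours.

```python
import numpy as np

def _neighbours(f, axis):
    """Return the two arrays of adjacent values along `axis`."""
    lo = [slice(None)] * f.ndim
    hi = [slice(None)] * f.ndim
    lo[axis] = slice(None, -1)
    hi[axis] = slice(1, None)
    return f[tuple(lo)], f[tuple(hi)]

def mu(f, axis):
    """Averaging operator: mean of the two neighbours along `axis`."""
    a, b = _neighbours(f, axis)
    return 0.5 * (a + b)

def delta(f, axis):
    """Differencing operator: difference of the two neighbours along `axis`."""
    a, b = _neighbours(f, axis)
    return b - a

def trilinear_x(xv, X, Y, Z):
    """Trilinear map for one cell; xv[a, b, c] is the value of x at the
    vertex (X, Y, Z) = (a - 1/2, b - 1/2, c - 1/2)."""
    total = 0.0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                Xi, Yi, Zi = a - 0.5, b - 0.5, c - 0.5
                total += (xv[a, b, c] * (0.25 + Xi * X)
                          * (0.25 + Yi * Y) * (0.25 + Zi * Z))
    return 8.0 * total

rng = np.random.default_rng(0)
xv = rng.random((2, 2, 2))           # x at the eight vertices of one cell

# Cell-centre derivative from the operators: mu_Y mu_Z delta_X x.
x_X_ops = mu(mu(delta(xv, 0), 1), 2).item()

# The same derivative from a centred difference of the trilinear map.
eps = 1e-6
x_X_fd = (trilinear_x(xv, eps, 0.0, 0.0)
          - trilinear_x(xv, -eps, 0.0, 0.0)) / (2.0 * eps)
print(x_X_ops, x_X_fd)
```

Both values agree with one quarter of the sum of the four vertex differences
across the cell in the X direction, which is the content of equation (8).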

Since the fluxes are calculated by averaging the values of the cell
centers rather than using values evaluated at the face centers of the
auxiliary cells, this formulation tends to decouple the solution at alternate
points of the grid. In order to compensate for this decoupling we effectively
shift the point of evaluation of the fluxes to the center of the faces of
the auxiliary cell by adding one term of their expansions about the
centers of the mesh cells. The added term is of the form

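$$-\epsilon\left[\mu_Z\delta_{XY}(A_X + A_Y)\delta_{XY} + \mu_X\delta_{YZ}(A_Y + A_Z)\delta_{YZ} + \mu_Y\delta_{ZX}(A_Z + A_X)\delta_{ZX} - \tfrac{1}{2}\delta_{XYZ}(A_X + A_Y + A_Z)\delta_{XYZ}\right]\phi, \tag{12}$$

where $A_X = \rho h(g^{11} - U^2/a^2)$ and similar formulas hold for $A_Y$
and $A_Z$. Here a is the speed of sound. The $g^{ij}$ are the elements of the
inverse of the metric tensor, and $0 \le \epsilon \le 1/2$. In practice
$\epsilon = 1/2$ is generally used, representing the strongest recoupling.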

In order to properly reflect the correct domain of dependence in
supersonic regions of the flowfield, it is necessary to introduce an
artificial viscosity. Since the local flow direction is not known in advance,
and we want the directional bias to be added in the streamwise direction, we
make use of Jameson's rotated scheme (ref. 12). Consider the potential
equation in quasilinear form written in coordinates locally aligned with the
velocity vector

$$(a^2 - q^2)\phi_{ss} + a^2(\nabla^2\phi - \phi_{ss}) = 0, \tag{13}$$

where s is a coordinate tangent to the streamline. The upwinding of the
$\phi_{ss}$ term can be accomplished by explicitly adding an appropriate
artificial viscosity to the central difference approximation. The addition of
such a term in divergence form can be shown to be analogous to the following
modified numerical scheme

$$(\rho hU + P)_X + (\rho hV + Q)_Y + (\rho hW + R)_Z + T = 0. \tag{14}$$
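The added flux P is constructed from

$$\hat{P} = \frac{\mu\rho h}{a^2}\left(U^2\delta_{XX} + UV\mu_{XY}\delta_{XY} + WU\mu_{ZX}\delta_{ZX}\right)\phi$$

by defining

$$P_{i+1/2,j,k} = \begin{cases} \hat{P}_{i,j,k} & \text{if } U > 0, \\ -\hat{P}_{i+1,j,k} & \text{if } U < 0. \end{cases} \tag{15}$$
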


The fluxes Q and R are constructed in an analogous fashion. Here, a
switching function $\mu$ has been introduced,

$$\mu = \max\left(0,\ 1 - \frac{M_c^2 a^2}{q^2}\right), \tag{16}$$

such that the directional bias is added in regions where the local speed
exceeds the value of some "cut off" Mach number $M_c$. It has been observed
that the MG technique functions at its best when the upwinding is switched
on even in a region slightly larger than the supersonic pocket. In practice
$M_c = 0.9$ is generally used.
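
The effect of the cut-off Mach number can be seen in a few lines. This is a
minimal sketch which assumes the switching function has the form
$\max(0, 1 - M_c^2/M^2)$, with M = q/a the local Mach number; the function
name is ours.

```python
def switching_function(mach: float, mach_cutoff: float = 0.9) -> float:
    """Artificial viscosity switch: zero below the cut-off Mach number,
    growing smoothly from zero above it."""
    if mach <= 0.0:
        return 0.0
    return max(0.0, 1.0 - (mach_cutoff / mach) ** 2)

for m in (0.8, 0.9, 0.95, 1.0, 1.2):
    print(m, switching_function(m))
```

With a cut-off of 0.9 the upwind bias is already active at M = 0.95, that is,
in a region slightly larger than the supersonic pocket.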

The solution of the nonlinear algebraic equations resulting from the 
discretization is accomplished by the formulation of an iterative scheme, 
embedding the steady state equation in an artificial time-dependent
equation.


B. Grid Generation 


The major difficulty in treating the full potential equation rather 
than its small perturbation approximation is to correctly satisfy the 
boundary conditions. This can be done easily if the difference equations 
are solved in a boundary conforming coordinate system. An essential advan- 
tage of the finite volume method is the decoupling of the grid generation 
step from the iterative procedure, since only local properties of the 
transformation are used. Nevertheless, it is often convenient to generate 
the coordinate grid by sequences of conformal and shearing transformations 
for a variety of practical problems. 


Consider the configuration shown in figure 1, consisting of a wing 
mounted upon a fuselage of varying cross sectional shape. We assume the 
flow is symmetrical about the vertical plane passing through the fuselage 
centerline so that only the flow in the half space $z \ge 0$ need be considered.


Denoting the fuselage surface by $R_F(x,\theta)$, the radial coordinate is
normalized out by defining

$$\bar{r} = \frac{r - R_F(x,\theta)}{R_T - R_F(x,\theta)}, \tag{17}$$

$R_T$ being the radial coordinate of the wing tip. The wing sweep and dihedral
are normalized out by introducing the coordinates:


X - X /--V 


(r) 


where 


? = 2(b J /R^-(6-a)2) 

20^ - TT^ 

a = — Te = 


(18) 


= (0 -a)^ + 

s ^ 


(19) 


and the sign is taken depending upon whether is positive or negative. 

Here $c(\bar{r})$ is the local chord length, and $x_s(\bar{r})$,
$\theta_s(\bar{r})$ represent the location of a singular line just inside the
leading edge of the wing. The singular line is then used as a branch point in
the conformal mapping

$$\bar{x} + i\bar{\theta} = \log\left[1 - \cosh(\xi + i\eta)\right] \tag{20}$$

to "unwrap" the wing surface. Under this transformation a surface of
constant $\bar{r}$ that intersects the wing surface, shown in figure 2a, will
take the form depicted in figure 2b. A final shearing transformation


X = C 



reduces the strip of width $S(\xi,\bar{r})$ to a constant width, as shown in
figure 2c, resulting in a nearly orthogonal mesh if the location of the
singular line has been carefully chosen. The computational domain shown 
schematically in figure 3 is rendered finite by suitable stretchings in the 
X and Z directions. The X stretching is set up at each spanwise station, 
so that the far downstream boundary remains a plane in the physical domain, 
even if the wing is tapered and/or swept. 

The definition of $R_F(x,\theta)$ and $S(\xi,\bar{r})$ from the input data of
the fuselage and wing geometry is achieved by spline fits. A spline fit in x
is applied for interpolating the coefficients of the Fourier decomposition of
the fuselage cross sections. Spline fits in the spanwise and $\xi$ directions
are employed to define $S(\xi,\bar{r})$. Having defined $R_F$ and S, we
proceed with calculating the physical coordinates of the grid points by
reversing the mapping sequence.


C. Boundary Conditions

The treatment of boundary conditions in a boundary conforming coordinate
system is quite easily accomplished since the finite difference scheme
is formulated in terms of the contravariant velocity vector (U,V,W). To
enforce the no flux condition across the solid surfaces (the fuselage and the
wing) the normal component of the velocity vector must be zero. This is
incorporated by reflecting the normal flux contributions for the cells
adjacent to the boundary.

The algorithm is simplified by introducing a reduced velocity potential
representing perturbations from the free stream. This potential is set to
zero on the upstream and lateral far field boundaries. On the far downstream
boundary, the first derivative in the streamwise direction of the
perturbation potential is set to zero; this is equivalent to setting the
streamwise velocity component to its free stream value. This boundary
condition is a consequence of the fact that the flow becomes fully developed
far downstream.

To account for lift, a vortex sheet emanating from the trailing edge 
of the wing must be allowed. It is assumed that the trailing vortex sheet 
lies along the branch cut (the dashed line in figure 2b) , thus convection 
and roll-up are ignored. On the two sides of the sheet we require that 
mass be conserved. For this purpose it is convenient to introduce dummy 
points above the boundary, which are identified in the physical domain with 
corresponding interior points on the other side of the cut. To envision 
this, imagine rotating the left branch cut (in figure 2b) in the
counterclockwise direction by 180° about the singular line, thus obtaining
the physical plane in figure 2a. For conserving the mass in cells whose
centers lie on the cut we calculate the contribution to the fluxes at the
centers of the dummy mesh cells from the values of the potential and the
coordinates of the corresponding cells reflected about the origin. Points
on both sides of the cut are treated as interior points by the same itera- 
tive algorithm. This procedure can also be used when the cut is a vortex 
sheet across which the jump in the potential is determined by the Kutta 
condition. 

In the original program it was required that the normal velocity
component be continuous across the vortex sheet: $V_Y = 0$. This condition
was applied also on the branch cut outboard of the tip of the wing. (This
portion of the cut has no physical meaning.) In the modified code mass is
conserved on points that lie on the vortex sheet inboard of the wing tip,
while $V_Y = 0$ was posed on the outboard part of the cut. This special
treatment of the branch cut provided best results when using the MG algorithm.

D. Multi-Grid Technique and Implementation Aspects 

An extensive discussion of the MG technique by Brandt in reference 5 
is very illuminating. In addition, we suggest other relevant references 
(refs. 3, 4) as background to our rather succinct presentation.

The MG method relies on the fact that relaxation schemes are efficient
in eliminating those components of the error whose wavelengths are
comparable to the mesh spacing. However, the process of liquidating the lower
frequency modes is characterized by a slow rate of convergence since the
effective signal speed on the fine grid is slow. The basis of the MG method
is to discretize the problem in a sequence of grids of different mesh
widths, allowing simultaneous treatment of the whole spectrum of error
modes. This greatly speeds up the convergence of the relaxation scheme.
Moreover, solving on coarser grids requires far less computational effort,
since the mesh points are fewer.


We proceed to describe the logical sequence of the MG procedure. Denote
the discretization of the scheme in equation (14) on a hierarchy of
grids $G^0, G^1, G^2, \ldots, G^K$ of varying coarseness by

$$L^k\phi^k = F^k, \qquad k = 0,1,2,\ldots,K, \tag{22}$$

K designating the finest grid, so that $F^K = 0$. We start the iteration on
the finest grid with the aid of some initial estimate. When slow convergence
is sensed, the relaxation process is discontinued on this grid. Exploiting
the smoothness of the tentative solution obtained, we carry out
relaxation on the next coarser grid. While $\phi^k$ is an approximate solution
on $G^k$, it cannot be expected to be a good approximation on $G^{k-1}$,
because of differences in the discretization errors of the two grids. The link
between the grid levels is made by using a forcing term which accounts for the
difference between the truncation errors of the two grids. Thus

$$L^{k-1}\phi^{k-1} = F^{k-1}, \tag{23}$$

where

$$F^{k-1} = I_k^{k-1}F^k + \tau_k^{k-1}, \tag{24}$$

and $\tau_k^{k-1}$ is the truncation error of the coarse grid relative to the
fine grid,

$$\tau_k^{k-1} = L^{k-1}(I_k^{k-1}\phi^k) - I_k^{k-1}L^k\phi^k. \tag{25}$$

Here $I_k^{k-1}$ denotes the interpolation operators from the fine grid to the
next coarser level. It should be noted that $(F^k - L^k\phi^k)$ is the
residual left by $\phi^k$. The operator for the residual interpolation is not
necessarily the same as that for the solution transfer.


Having calculated an approximate solution $\phi^{k-1}$ on the coarse grid, an
updated solution on the fine grid may be obtained. Simply interpolating
$\phi^{k-1}$ to the fine grid, however, cannot be done, since this would cause
the high frequency components of the solution to be lost. These components can
be maintained by adding to the recent solution $\phi^k$ the contribution of
low frequency components to the correction, namely the difference between the
updated solution on the coarse grid $\phi^{k-1}$ and its estimate
$I_k^{k-1}\phi^k$. Thus, an improved solution on the fine grid is

$$\phi^k_{new} = \phi^k + I_{k-1}^k\left(\phi^{k-1} - I_k^{k-1}\phi^k\right). \tag{26}$$

Note that $I_k^{k-1}\phi^k$ needs to be saved. Alternatively, equation (26)
can be written as

$$\phi^k_{new} = I_{k-1}^k\phi^{k-1} + \left(\phi^k - I_{k-1}^k I_k^{k-1}\phi^k\right),$$

and the updated solution can be interpreted as follows: transfer the updated
solution $\phi^k$ to $G^{k-1}$ and interpolate it back to the fine grid
($I_{k-1}^k I_k^{k-1}\phi^k$ now contains only low frequency components);
subtract the result from $\phi^k$ to form the contribution of high frequency
components to the correction.


Error components of arbitrarily low wavenumbers can be diminished by 
extending the above sequence onto yet coarser grids. 
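
The coarse-grid forcing and the fine-grid update of equations (23)-(26) can be
exercised on a much simpler problem than the one treated here. The following
sketch is illustrative only: it applies the same logic to the one-dimensional
model problem -u'' = f with homogeneous Dirichlet conditions, uses point
Gauss-Seidel in place of XSLOR, solves the coarse problem directly instead of
recursing to further levels, and uses function names that are ours.

```python
import numpy as np

def apply_L(u, h):
    """Discrete operator L u = -u'' at interior points (zero at the ends)."""
    Lu = np.zeros_like(u)
    Lu[1:-1] = (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return Lu

def relax(u, f, h, sweeps):
    """Point Gauss-Seidel sweeps for L u = f (a stand-in for XSLOR)."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h**2 * f[i])
    return u

def inject(v):
    """Fine-to-coarse transfer of the solution by injection."""
    return v[::2].copy()

def weight(r):
    """Fine-to-coarse transfer of residuals with weights (1/4, 1/2, 1/4)."""
    rc = np.zeros((len(r) + 1) // 2)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    return rc

def prolong(vc):
    """Coarse-to-fine transfer by linear interpolation."""
    v = np.zeros(2 * len(vc) - 1)
    v[::2] = vc
    v[1::2] = 0.5 * (vc[:-1] + vc[1:])
    return v

def two_grid_cycle(u, f, h):
    u = relax(u, f, h, sweeps=2)               # smooth on the fine grid
    uc0 = inject(u)                            # I phi^k, saved for (26)
    H = 2.0 * h
    # Coarse forcing of (24)-(25): L(I phi) plus the transferred residual.
    fc = apply_L(uc0, H) + weight(f - apply_L(u, h))
    n = len(uc0) - 2
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / H**2
    uc = np.zeros_like(uc0)
    uc[1:-1] = np.linalg.solve(A, fc[1:-1])    # coarse problem (23)
    u = u + prolong(uc - uc0)                  # fine-grid update (26)
    return relax(u, f, h, sweeps=1)

N = 64
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
f = np.pi**2 * np.sin(np.pi * x)               # exact solution: sin(pi x)
u = np.zeros(N + 1)
r0 = np.linalg.norm((f - apply_L(u, h))[1:-1])
history = []
for _ in range(10):
    u = two_grid_cycle(u, f, h)
    history.append(np.linalg.norm((f - apply_L(u, h))[1:-1]))
print(history[-1] / r0)
```

Only the difference between the coarse solution and the injected fine
solution is interpolated back, so the high frequency content of the fine-grid
solution is left untouched by the correction.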

Calculations have been performed on grids of 64x16x16 cells in X,Y,Z
directions, respectively. The grid was coarsened by eliminating every other
mesh point in each direction. Most often a sequence of four grid levels
was employed in the MG process. The set up of the program admits the use
of a fifth level (4x1x1), one that takes the wing to be infinite.

The equations resulting from the discretization in equation (14) are
sequentially solved on planes of constant Z (marching from the fuselage
towards the lateral boundary), each of which is swept by successive line
overrelaxation along lines of constant X (XSLOR). Because of the local
nature of relaxation schemes, it is convenient to store in virtual memory
the coordinates and the solution of only the Z plane being swept and its
two neighboring planes on either side, plus the old solution of the preceding
plane. The coordinates and the solution on the entire domain are stored on
a disk and information is buffered in and out of the virtual memory as
needed, while calculations are being performed. Disk manipulation requires
careful programming for transferring the potential and the weighted residuals
to the next coarser grid, and in interpolating the corrections to the next
finer level. The interpolation of the potential to the next coarser level
(in equations (23) and (26)) is done by 'injection', i.e., values of the
potential from the fine grid are transferred at points common to both
levels. Three fine grid planes contribute to the construction of the volume
average of the residuals (in equation (24)) as sketched in figure 4a. Four
coarse grid planes participate in forming the four point Lagrangian
interpolation in the Z direction (see figure 4b). The same interpolation
operator is used within each of these planes to improve the solution at each
of the fine grid points that lie on them (in equation (26)). The buffering of
the potential for interpolating in the Z direction is somewhat complicated
in that the solution from the first and third preceding planes must be
available.
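
For equally spaced coarse planes, the four point Lagrangian interpolation to
a fine plane lying midway between two coarse planes reduces to fixed weights.
The sketch below derives them and applies them along one direction; it is an
illustration under the assumption of uniform spacing, it ignores the one-sided
treatment that is needed next to boundaries, and the function names are ours.

```python
import numpy as np

def lagrange_midpoint_weights():
    """Weights of cubic Lagrange interpolation at the midpoint of four
    equally spaced nodes placed at -3, -1, 1, 3 (in half-spacings)."""
    nodes = np.array([-3.0, -1.0, 1.0, 3.0])
    target = 0.0
    weights = np.ones(4)
    for i in range(4):
        for j in range(4):
            if i != j:
                weights[i] *= (target - nodes[j]) / (nodes[i] - nodes[j])
    return weights

def interpolate_midplanes(coarse):
    """Values midway between coarse planes 1-2, 2-3, ... along axis 0,
    each built from the four nearest coarse planes."""
    w = lagrange_midpoint_weights()
    return (w[0] * coarse[:-3] + w[1] * coarse[1:-2]
            + w[2] * coarse[2:-1] + w[3] * coarse[3:])

weights = lagrange_midpoint_weights()
z = np.arange(6.0)                       # coarse plane positions
cubic = lambda s: s**3 - 2.0 * s + 1.0   # any cubic is reproduced exactly
mid_values = interpolate_midplanes(cubic(z))
print(weights, mid_values)
```

Two planes on either side of each new fine plane are involved, which is why
the solution on the first and third preceding coarse planes must be kept
available while buffering.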


A fixed strategy using one sweep on each visited grid has been shown to
be effective (ref. 7). The domain is swept once on each grid level until
the coarsest grid is reached. Each level is swept once after coarse grid
corrections are added while backing up to the second finest grid. This
completes a MG cycle. Thus, the work required for one MG cycle for a problem
in d space dimensions is

$$\left(1 + \frac{2}{2^d} + \frac{2}{2^{2d}} + \frac{2}{2^{3d}} + \ldots\right) < 1 + \frac{2}{2^d - 1} \ \text{units}, \tag{27}$$

where 1 unit is the work required for a fine grid sweep (ignoring the
overhead due to transferring and interpolation). For a three-dimensional
problem, this amounts to 1 2/7 work units. The use of a fixed strategy rather
than an adaptive one simplifies the coding. Also, no switching criteria,
whose determination would have required numerous numerical experiments, need
be specified.

The adaptation of the MG procedure calls for additional storage amounting to
$2/(2^d - 1)$ of the storage needed for the solution on the fine grid, which
is 2/7 in three dimensions. Half of this space is needed for storing the
potential of all levels except the finest, while the other half is needed for
the residuals on these levels.
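
The work and storage estimates above are geometric sums and can be verified
with exact arithmetic. The sketch below assumes one sweep on the finest grid
and two sweeps on every coarser grid per cycle (counting the coarsest grid
twice gives the upper bound), and a coarsening factor of 2 in each of the d
directions; the function names are ours.

```python
from fractions import Fraction

def cycle_work(d: int, levels: int) -> Fraction:
    """Work of one fixed-strategy cycle, in fine-grid sweep units."""
    coarse = sum(Fraction(2, 2 ** (d * k)) for k in range(1, levels))
    return Fraction(1) + coarse

def work_bound(d: int) -> Fraction:
    """Limit of the work for infinitely many levels: 1 + 2/(2^d - 1)."""
    return Fraction(1) + Fraction(2, 2**d - 1)

def extra_storage(d: int) -> Fraction:
    """Potential plus residuals on all coarse levels, relative to the
    storage of the fine-grid solution."""
    return Fraction(2, 2**d - 1)

print(cycle_work(3, 4), work_bound(3), extra_storage(3))
print(work_bound(2))
```

With four levels in three dimensions the cycle costs 329/256 of a fine-grid
sweep, just below the bound of 9/7 = 1 2/7; in two dimensions the same bound
is 1 2/3.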

The need to use the buffering procedure is due to the limited virtual 
memory of the computer in use. Since only the potential and the coordinates 
of the points corresponding to the grid being relaxed need be retrieved, it 
follows that the buffering procedure employed is inefficient on coarse grids. 
This procedure was appropriate for the original code. We could have chosen 
one of the following three options for the modification of the retrieval 
procedure of the coordinates: 

-Making use of a special routine that coarsens/refines the grid. This 
would have still required the retrieval of the coordinates of the 
finest grid when relaxing on any level. 

-Utilizing the 'random' access mode for buffering in just the coordinates
of the points of the grid under treatment, skipping all the others.
This is a quite expensive operation since the mode consists of a
searching operation in addition to the retrieving operation.

-Taking advantage of the fixed strategy, by constructing a disk file 
that contains the stored coordinates of all levels in a preset order. 
The arrangement is so made as to coincide with the strategy used. More 
specifically, the coordinates of the fine grid are put at the head of 
the disk, followed by the coordinates of the coarser grids in sequen- 
tial order, down to the coarsest level. The coordinates of every 
level are stored twice, because the grid is once relaxed and then 
swept for the calculation of the residuals. Next, the coordinates of 
the grids are stored in the reverse order, up to the second finest 
grid. The finest level need be stored only once, since for the
residual calculation (for transferring to the next coarser level) the
disk can be inexpensively rewound.

Initial estimates indicated that the second option should be more 
efficient than the first, and it was coded. Subsequently, the third alterna- 
tive has been incorporated into the code, exhibiting an additional thirty 
percent reduction in the cost of computation. Adopting this option implies
a 3/7 extension of the space required for storing the fine grid coordinates.
This does not cause any problems since here we utilize the disk
storage. The use of a computer of larger storage capacity (either real or 
virtual) would allow the coordinates of the fine grid and the potential of 
all levels to be stored in memory, eliminating the need for the buffering. 
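
The layout of the coordinate file in the third option, and the resulting 3/7
storage extension, can be sketched as follows. This is our reading of the
description, not the actual file format: we assume that every level except
the finest is written twice on the way down (coarsest included), that the
return leg starts at the second coarsest level, and that each coarser level
holds 1/8 of the points of the previous one. The function names are
hypothetical.

```python
from fractions import Fraction

def coordinate_file_order(levels):
    """Level index of each coordinate record on the disk file, in the order
    in which the fixed strategy reads them (0 is the finest level)."""
    order = [0]                                # finest grid, stored once
    for k in range(1, levels):                 # way down: relax + residuals
        order += [k, k]
    order += list(range(levels - 2, 0, -1))    # way back up to second finest
    return order

def storage_extension(order, d=3):
    """Extra space relative to one copy of the fine-grid coordinates."""
    total = sum(Fraction(1, 2 ** (d * k)) for k in order)
    return total - 1

order = coordinate_file_order(4)
extension = storage_extension(order)
print(order, extension, float(extension))
```

For four levels the extension is 109/256, about 0.426, which approaches 3/7
as the number of levels grows: 2/7 for the doubled records on the way down
and 1/7 for the records on the way back up.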

Special attention must be paid to the handling of the boundary conditions
when implementing the MG procedure. As formerly described, the incorporation
of the boundary conditions on solid surfaces and on the vortex
sheet allows use of the same algorithm as at internal points. This treatment
proves adequate for the adaptability of the MG, since it does not disturb
the interior smoothness. The Dirichlet boundary condition on the
upstream and lateral boundaries does not affect the smoothness of the
solution at interior points, either. However, difficulty was encountered when
imposing the Neumann condition for the velocity potential on the downstream
boundary (which was done by setting the potential on the boundary plane to
its value on the third plane upstream of the boundary). This difficulty
was overcome by invoking the Neumann condition directly in the following
manner: fictitious cells are introduced next to the downstream boundary;
their velocities in the streamwise direction (at the center of the cells)
are calculated by extrapolation of the velocities at the centers of the
immediate interior cells and the free stream velocity on the boundary;
the standard algorithm is then applied for calculating the potential on the
downstream boundary. Note the similarity of this approach to that used for
the calculations at points on the solid surfaces and the vortex sheet. The
special operator (for $V_Y = 0$) applied on the cut outboard of the wing tip
requires careful treatment. The residuals at these points must be correctly
scaled in order to make them comparable to the residuals at neighboring
points.

Our recommendation is that for a well-coded MG program the solution at
all points of the computational domain including boundary points is to be 
calculated by the standard relaxation algorithm, excluding the points whose 
specified conditions are of the Dirichlet type. If special operators need 
to be devised on some boundaries, caution is required when implementing MG, 
to guarantee smoothness of the solution elsewhere. 

E. Predictability 


A local mode analysis provides a reliable estimate of the MG perfor- 
mance. It is assumed that the correction can be represented by a Fourier 
series, and it is of interest to follow the evolution of the amplitude of 
one mode. It is further assumed that periodic boundary conditions are 
specified. Noting that equation (6) is equivalent to $\rho h/a^2$ times
equation (13), the potential equation in quasilinear form is utilized to construct
the iterative scheme. Under the transformation in equation (20), the 
potential equation in cylindrical coordinates reads 






nn 


rr 
(28)


the coefficients being functions of a, the three velocity components, r and
the derivatives of the transformation. Locally "freezing" the coefficients
about the available solution (the Fourier analysis being useful only for
linear schemes), the XSLOR scheme yields for the growth factor G (the
reduction of the error amplitude during one iteration)
where:


$\omega$ - overrelaxation factor
$\alpha_i = p_i h_i$, i = 1,2,3 in $\xi$, $\eta$, r
$p_i$ - wave number
$h_i$ - mesh spacing


for subsonic regions, with a similar expression for hyperbolic regions. The
growth factor is thus seen to be strongly dependent on the aspect ratios of
the cells: $A_\eta = \Delta\eta/\Delta\xi$, $A_r = \Delta r/\Delta\xi$.

The assumption of periodic boundary conditions implicit in these
estimates can introduce substantial errors in problems of practical interest.
Therefore one cannot always extract accurate predictions of rates of
convergence for conventional relaxation schemes. In particular, the Fourier
analysis is not accurate for low wavenumber modes, since they are affected
by the boundaries of the domain. In contrast, the analysis is most reliable
for those high wavenumber harmonics which interact at short distances. On
this hinges the reliability of the estimates for MG; consideration need be
given only to harmonics in the range $\pi/2$ to $\pi$ (from the wavelength of
four times the mesh spacing to the smallest discernible wavelength), since the
low frequency components of the error are effectively eliminated by the
relaxation processes performed on the coarser grids in much less work.
Therefore


Ḡ = max |G|,   π/2 ≤ max|θ_i| ≤ π                                     (30) 

provides an estimate for the rate of convergence of the MG algorithm. For 
the fixed strategy we have used, Ĝ (Ḡ rescaled by the work per cycle) is 
the growth factor per work unit. 
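
The procedure behind equation (30) can be illustrated on a much simpler stand-in. The sketch below is not the XSLOR scheme of equation (29): it assumes lexicographic point Gauss-Seidel applied to the five-point discretization of u_xx + eps*u_yy = 0, where the hypothetical parameter `eps` plays the role of a squared cell aspect ratio. The function names are illustrative only.

```python
import cmath
import math

def growth_factor(theta1, theta2, eps):
    """Amplification of the Fourier mode (theta1, theta2) in one sweep of
    lexicographic point Gauss-Seidel for  u_xx + eps * u_yy = 0."""
    num = cmath.exp(1j * theta1) + eps * cmath.exp(1j * theta2)
    den = 2.0 + 2.0 * eps - cmath.exp(-1j * theta1) - eps * cmath.exp(-1j * theta2)
    return abs(num / den)

def smoothing_factor(eps, n=128):
    """Largest growth factor over modes with pi/2 <= max(|theta1|, |theta2|) <= pi."""
    thetas = [-math.pi + 2.0 * math.pi * k / n for k in range(n + 1)]
    worst = 0.0
    for t1 in thetas:
        for t2 in thetas:
            if max(abs(t1), abs(t2)) >= math.pi / 2:   # high-frequency modes only
                worst = max(worst, growth_factor(t1, t2, eps))
    return worst

mu_square = smoothing_factor(1.0)       # square cells
mu_stretched = smoothing_factor(0.01)   # strongly stretched cells
print(f"square cells:    {mu_square:.3f}")
print(f"stretched cells: {mu_stretched:.3f}")
```

For square cells this stand-in scheme gives a factor near 0.5, and the value approaches 1 as the cells become strongly stretched, which is the kind of aspect-ratio dependence noted above.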

Equation (29) is rather complicated for analytical treatment for seeking 
G. The worst growth factor (for a general relaxation scheme) can be 
found by inspecting modes of possible combinations of 0, π/2 and π for 
extreme values of the aspect ratios. This yields comparatively simple 
expressions for G. In figure 5 G is plotted against one aspect ratio for 
extreme values of the other (0 and ∞) for a specified subsonic uniform 
stream (the velocity vector has very little effect on G), the coordinate r 
and the overrelaxation factor ω. The choice of r is not important since it 
is inversely related to A and it can be absorbed in the definition of A 
(which is checked for 0 and ∞ anyway). These results suggest that MG is 
capable of significantly accelerating the original code and that, 
regardless of the mesh in use, the upper bound for the growth factor is 
approximately 0.78 (Ĝ ≈ .824). 

Although we did not call attention to it, the derivation presented is 
appropriate only for parts of the domain where the grid lines of the mapped 
mesh coincide more or less with the cylindrical coordinates x,θ,r 
(simplification of the expressions for the coefficients in equation (28) was 
permitted by setting 1^[ = 1, ^0 = O). It is seen in figure 6 that this 
might be a good approximation only in regions downstream of the midchord of 
the wing. A more complete analysis requires further investigation of the 
network, such that the shearing transformation (21) is taken into account, 
as well as the orientation of the cells and their true aspect ratios. This 
can be achieved by resorting to numerical calculations of G by scanning the 
entire domain. The growth factor thus obtained (at the far upstream cell, 
close to the fuselage) was very close to the G estimated above. (For super- 
sonic regions in the vicinity of the wing G is smaller.) The fact that 
the G’s are virtually the same for this case should not be taken to imply 
that the more complete analysis can be overlooked in all cases. Two-dimen- 
sional calculations on a parabolic grid revealed that the more complete 
estimate was 'slower' (it was equal to the 'rough' estimate raised to the 
power of 1/1.4), and insisting on attaining rates of convergence predicted 
by a rough analysis would have been futile. 

We remark that the calculation of the aspect ratios of the cells in 
two-dimensional meshes is greatly simplified by utilizing the conformal 
properties of the transformation (no distortion), so that aspect ratios are 
readily calculated from the physical coordinates. This, unfortunately, does 
not hold for the three-dimensional networks, and calculations must be 
carried out in the transformed space ξ,η,r. Also, the problem of highly 
elongated cells in the aforementioned two-dimensional grid was alleviated 
by ’redistributing’ the aspect ratios within the domain (which was most 
easily done by introducing a suitable stretching function), resulting in 
better rates of convergence — both theoretically and numerically. Such a 
cure cannot be prescribed for the three-dimensional networks used, since 
the aspect ratios of the cells are already quite uniformly distributed. 

Given an estimate for the rate of convergence, it is possible to 
estimate the computational effort required to solve the problem to the level 
of its truncation errors. Suppose the problem is first solved on the coarse 
grid G^{K-1} to within O(τ^{K-1}) (τ designates the truncation error). 
Assuming that high frequency errors are not introduced by interpolation to 
the finest level, the errors in the initial estimate for the G^K problem are 
already reduced to O(τ^{K-1}). Thus, the work required to reduce them to 
O(τ^K) is 

W_K = log O(τ^K/τ^{K-1}) / log G.                                     (31) 

Likewise, having an initial estimate of O(τ^{K-2}), the work required to 
solve the G^{K-1} problem (for the strategy in use) to the level of τ^{K-1} 
is 2 log O(τ^{K-1}/τ^{K-2})/(2^3 log G), since it has 1/2^3 as many grid 
points as the finest grid. For the second order scheme, τ^k/τ^{k-1} ≈ 1/4. 
Therefore, the computational work for solving the problem to the level of 
its truncation errors is 

[Equation (32) is not legible in the source.]                         (32) 

neglecting the work on the coarsest grid. 
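
The per-level terms described above can be summed numerically. The sketch below is an assumed reading of the lost equation (32), not a transcription of it: every grid is taken to need log(1/4)/log G of its own sweeps, each coarser grid costs 1/2^3 of the next finer one, the coarser-grid terms carry the factor 2 quoted in the text, and the coarsest grid is neglected. The function and parameter names are hypothetical.

```python
import math

def work_to_truncation_level(G, levels, order=2, dims=3, coarse_factor=2.0):
    """Work units needed to reach truncation-error level on the finest grid.

    Assumed model: tau^k / tau^(k-1) ~ 1 / 2**order on every level, a coarser
    grid has 1 / 2**dims as many points as the next finer one, and the
    coarsest grid (level 1) is neglected.
    """
    per_level = math.log(0.5 ** order) / math.log(G)
    total = per_level                                   # finest grid, eq. (31)
    for j in range(1, levels - 1):                      # grids K-1, ..., 2
        total += coarse_factor * per_level / (2 ** dims) ** j
    return total

w_theory = work_to_truncation_level(0.824, levels=5)    # predicted rate
w_measured = work_to_truncation_level(0.80, levels=5)   # measured rate
print(f"{w_theory:.1f} {w_measured:.1f}")
```

Under these assumptions the two values round to the 9.2 and 8 work units quoted in the Results section, which suggests (but does not prove) that the lost equation has this form.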

This, of course, should not be taken as a practical terminating cri- 
terion of the relaxation process. Even if it does not represent the state 
of affairs exactly, it certainly constitutes a good approximation, and 
one can choose a stopping criterion depending upon the requirements of 
accuracy. The criterion used in practice should be determined by a percep- 
tive interpretation of experimental results. The modified program has not 
been devised to include a switch for terminating the computing process since 
our objective was to check the validity of the estimate in forming the basis 
for deriving a stopping criterion. 


III. RESULTS 

In the following we demonstrate the advantages of the MG procedure. 

The geometry tested is the ONERA wing M-6 mid-mounted upon a cylindrical 
body of radius of .25 the semispan. The wing has a 30° leading edge sweep, 
a taper ratio of .562 and a uniformly tapered cross-section of approximately 
10% thickness ratio. A perspective view of the configuration is shown in 
figure 7. 

Computations were carried out on a mesh of 64x16x16 cells in the X,Y,Z 
directions, respectively, the crudest grid (corresponding to the fifth MG 
level) containing just 4 cells (it is a 4x1x1 grid). The first set of 
results was calculated for a lifting configuration with a moderately sized 
supersonic zone (containing approximately 2.% of the points on the grid used). 
The free stream Mach number was .84 and the wing and fuselage were at an 
angle of attack of 3.06°. The results show the effect of using different 
numbers of grid levels. Figure 8a displays the convergence histories of 
the average residuals, and indicates the beneficial effect of MG: while the 
convergence rate of the original scheme is .982, MG using a sequence of 
four grids yields a rate of .80 (more than 12 times faster than the original 
code). Without MG, the situation is aggravated on finer meshes (in fact, it 
can be predicted by expansion of G in equation (29) for low frequencies that 
the asymptotic convergence rate will be 1 - O(h^2)), whereas the performance 
of MG is independent of the fineness of the mesh. Therefore, the relative 
advantage of the MG increases as the mesh gets finer. The superiority of the 
MG scheme can also be seen in figures 8b and 8c, which show the convergence 
histories of the circulation at the root section of the wing and the number 
of supersonic points detected in the solution. (In these figures the con- 
tinuous lines are drawn by connecting the points for the sake of clarity.) 
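
The "more than 12 times faster" figure follows from comparing the logarithms of the two rates. The short sketch below reproduces that arithmetic and, as a purely hypothetical extension, shows what a 1 - O(h^2) single-grid rate would imply if the mesh spacing were halved; the halved-mesh numbers are an illustration, not results from the paper.

```python
import math

def sweeps_ratio(rate_slow, rate_fast):
    """Number of iterations at rate_slow that match one iteration at rate_fast."""
    return math.log(rate_fast) / math.log(rate_slow)

speedup = sweeps_ratio(0.982, 0.80)

# Hypothetical finer mesh: if 1 - rate scales like h**2, halving h moves
# the single-grid rate from 0.982 to 1 - 0.018/4, while the MG rate stays put.
rate_halved_h = 1.0 - (1.0 - 0.982) / 4.0
speedup_halved_h = sweeps_ratio(rate_halved_h, 0.80)
print(f"{speedup:.1f} {speedup_halved_h:.1f}")
```

The ratio grows roughly fourfold for each halving of the mesh spacing under this assumption, which is the sense in which the relative advantage of MG increases on finer meshes.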

As described in the previous section, the estimated rate of convergence 
is Ĝ = .824, while the experimental result is slightly better: .80. That 
the theoretical prediction tends to underestimate the performance of MG 
algorithms was also reported in reference 3. An MG code can be considered 
to be correctly programmed and free of bugs and flaws (which are most 
commonly inflicted by incorrect handling of boundary conditions), as long as 
the experimental rate of convergence is bounded by the predicted rate. Also, 
these results suggest that the choice of a fixed strategy was adequate. 
The rate attained in our calculations bears out Brandt's assertion that one 
should not settle for any convergence rate slower than that predicted by a 
local mode analysis. 

The estimated computational work required for obtaining converged solu- 
tions which are at the level of the truncation error is W_K = 9.2 work units 
(WU). This follows from equation (32) when inserting the theoretical value 
for the rate of convergence. We prefer to use this value for the rate rather 
than the convergence rate obtained experimentally, since this yields a more 
conservative criterion for stopping the relaxation process (letting G = .80 
yields W_K = 8 WU). This is 'safer' since the experimental G may increase 
(hopefully still bounded by .824) when using different meshes or treating 
different configurations. 

Surprisingly, the solution obtained by five grid levels converges faster 
than the MG that employs just four grids, although the rate of convergence 
of the latter is marginally higher. The solutions obtained in both cases are 
identical, even though on the fifth grid the no flux condition on the wing 
is extended to the region outboard of the wing tip to the lateral boundary. 
Referring to figure 6, this region lies between the dashed line representing 
the singular line of the conformal map (in equation (20)), and the dashed 
line leaving the trailing edge of the wing. It was reported in several pub- 
lications that the coarsest possible MG level has a negligible effect in 
improving the performance of MG that uses a sequence of levels excluding the 
coarsest. To explore the difference between MG employing five grids and MG 
that uses four grids, we list the error in the solutions after 9.2 WU as 
compared to their converged values (refer to figures 8b, 8c): 

             circulation    number of supersonic points 

4 grids          2%                   2.1% 

5 grids          1%                   1.6% 

It appears that the four-grid MG will require about three more work units to 
achieve the value of the circulation obtained by the five-grid MG after 9.2 
WU. The relative advantage of the scheme employing all possible coarse 
levels decreases as we pose requirements for higher accuracy. For example, 
if this were the case, we would have to continue the relaxation process up 
to 15.1 WU for the five-grid MG or 16.4 WU when using just four grids, in 
order to guarantee a converged circulation to within 0.07% and to capture 
all supersonic points. Note that it is only after 16 iterations that the 
original code starts to detect any supersonic points at all. 

In figure 9 the streamwise pressure distributions are plotted for the 
.3, .6 and .9 semispan stations. Comparison is made between a well converged 
solution and the solution obtained by the four level MG after 8.7 WU. Notice 
the typical pattern of the leading edge shock and the trailing edge shock 
(which is smeared out because of the poor resolution on the aft portion of 
the wing caused by the parabolic nature of the mapping) that merge as we 
proceed outboard. The differences in the pressures are rather small and 
are limited to the region between the shocks. The discrepancies seem to be 
larger as we approach the tip. After 8.7 WU the lift, drag and moment 
coefficients of the wing were converged to within 2.6%, 5.0% and 2.5%, 
respectively. A fifth coarse level could probably provide a better converged 
solution. 

Next, we consider a uniform free stream at a Mach number of .923 and 
zero angle of attack, resulting in a non-lifting flow for this symmetrical 
geometry. A well converged solution indicates that the flow velocity is 
supersonic at 6.7% of the mesh points. Convergence histories of the average 
residuals and the number of supersonic points are displayed in figure 10, 
in which comparison is made between the four-grid MG scheme and the relaxa- 
tion scheme without the MG. The rates of convergence of both cases are 
almost identical to the corresponding rates attained for the lifting case. 
At the estimated computational effort required for convergence (9.2 WU), 
96% of the total number of supersonic points were established. After 16.4 
WU, the number of supersonic points had converged to within .6% of the total 
number. 

The streamwise pressure distributions are presented in figure 11 for 
the .3, .6 and .9 semispan locations. As in the lifting case, we show the 
deviation of the pressure distributions achieved by the four-level MG 
scheme after 8.7 WU from those of a well converged solution. Differences 
are minor in the vicinity of the fuselage and they become more prominent as 
we proceed outboard. Unlike in the lifting case, the overall drag coeffi- 
cient of the wing was well converged at the end of 8.7 WU, having an error 
of 1.45% vs. 5.0% for the lifting case. 


IV. CONCLUSIONS 

The MG technique has been incorporated into an existing computer pro- 
gram that calculates the transonic potential flow past wing-fuselage com- 
binations. The program uses a conventional SLOR/rotated-difference 
smoothing algorithm to calculate mixed elliptic-hyperbolic flowfields that 
contain discontinuities. The computational effort when solving on a rather 
coarse grid (64x16x16) is reduced by an order of magnitude for a given 
accuracy. The merit of the method becomes more prominent when calculating 
on finer meshes which are of engineering interest. The rates of convergence 
of the MG algorithm are in remarkably good agreement with theoretical esti- 
mates from a local mode analysis, even on the curved and highly stretched 
mesh of the present computations. We stress that it is of great importance 
to find the maximum growth factor by such an analysis in the early stages 
of developing a MG code. Although the expression for the growth factor may 
be quite complicated, it is worthwhile to extract from it as much informa- 
tion as possible so as to be aware of what might be expected from the pro- 
gram, and also for systematically optimizing the mesh. 

The modified MG program should be subject to further study of other 
practical configurations. Also, the MG technique may be utilized for 
increasing the accuracy in various ways: for example, by sequential 
refinement, which can be employed in regions of the flowfield where high 
derivatives of flow properties are likely to occur (as in the vicinity of 
shocks, at the leading and trailing edges of the wing and at the wing tip), 
or by extrapolating the truncation errors on coarse grids (which implies 
the need for minor changes in the forcing term in equation (23)). 

Figure 3. - Sketch of computational domain. 

Figure 4. - Schematic representation of residual weighting and correction 
interpolation. 

Figure 7. - Perspective view of ONERA wing M-6 mid-mounted upon cylindrical 
fuselage. 

Figure 9. - Streamwise wing surface pressure distributions for M∞ = .84 and 
3.06° angle of attack. 

Figure 10. - Convergence histories of average residuals and number of super- 
sonic points for M∞ = .923 and zero angle of attack. 


A MULTIGRID MESH-EMBEDDING TECHNIQUE 
for 
Three-Dimensional Transonic Potential Flow Analysis 

By: Jeffrey J. Brown 

Boeing Commercial Airplane Company 


ABSTRACT 


A method for obtaining the fine detail of a transonic flowfield 
is presented. The technique employs the multigrid method to embed 
very dense meshes in regions of interest. Accurate results are 
obtained on meshes of a heretofore unobtainable density with reasonable 
computer expenditures. Comparisons of results with data reveal 
accurate predictions in the supersonic bubble of a transonic inlet, 
an area which is incorrectly predicted by existing techniques. 

More accurate results are also obtained with the new method on 
a mesh of a density comparable to existing codes and at a lower 
cost. 


NOMENCLATURE 

English 

a    speed of sound 
F    forcing function 
G    grid level identifier 
I    interpolation operator 
L    differential operator 
q    freestream velocity 
r    radial coordinate 
z    axial coordinate 

Greek 

γ    ratio of specific heats 
θ    circumferential coordinate 
φ    potential function 

Superscript 

k    grid level 

Subscript 

k    grid level 
∞    freestream condition 


INTRODUCTION 


The standard relaxation methods used in large-scale fluid- 
dynamics computations are local by nature. The current solution 
at each grid point is influenced solely by information from neighbor- 
ing points. Consequently, the global rate of information transmission 
is asymptotically slow and is aggravated the more dense a mesh 
becomes. The result of this situation is an inability to economically 
predict the fine details of a flowfield (i.e., peak Mach numbers, 
shock locations, etc.). Indeed, the computing time required to 
obtain the fine details of a flowfield seriously limits the usefulness 
of many realistic production codes. This is especially true for 
design applications which are, by nature, iterative processes. 
Consequently, methods for increasing the efficiency of the relaxation 
process have received much attention. While gains have been 
made, success has seldom been dramatic, often relying upon highly 
problem-dependent assumptions. 

Recently, however, new mathematical techniques, referred 
to as multigrid methods, have been proposed by Brandt (references 
1 and 2). These methods theoretically offer from one to several 
orders of magnitude improvement in execution time and provide 
greatly improved accuracy as well. Brandt has demonstrated remarkable 
success with two-dimensional elliptic problems of generally academic 
interest. The applicability of the multigrid method to transonic 
fluid dynamics computations (a mixed hyperbolic-elliptic problem) 
was demonstrated by McCarthy and Reyhner (reference 3). They 
incorporated the multigrid procedure into the Reyhner code for 
three-dimensional transonic potential flow around axisymmetric 
inlets at angle of attack (reference 4). 

The McCarthy-Reyhner code is a finite-difference, non-conserva- 
tive, successive-line-over-relaxation (SLOR) scheme which operates 
in cylindrical coordinates in the physical domain. It uses a 
hierarchy of four mesh densities, the finest of which (level 4) 
is roughly twice as dense as the finest practical mesh available 
with the Reyhner code. Very accurate results are obtained with 
the McCarthy-Reyhner code on level 4 in approximately three minutes 
CYBER-175 central processor (CP) time. As a comparison, it has 
been estimated that it would require six hours CP time to achieve 
similar results with the Reyhner code (modified for level 4). 

As dramatic as these results are, experience with the McCarthy- 
Reyhner code has indicated that there are regions of a flowfield 
(e.g., the highlight region in an engine inlet) where level 4 
is not sufficiently fine to accurately determine the details of 
the flowfield. It would not be practical, from a computer storage 
requirement, to extend the McCarthy-Reyhner code to level 5 (twice 
as dense as level 4) to attempt to resolve the fine detail of 
the flow. 

The present work describes an investigation of a technique 
for embedding very dense meshes in regions of interest. The approach 
involves modifying the McCarthy-Reyhner code to embed a series 
of dense meshes in the region of an inlet highlight. Global informa- 
tion is transmitted to the embedded meshes from the coarser meshes 
via the multigrid procedure. Likewise, the fine detail of the 
flowfield is conveyed to the coarser meshes during the multigrid 
process. Hence, the solution on each grid (embedded and full domain) 
is corrected by information transmitted by the multigrid process 
from the other grid. 


The author gratefully acknowledges the work of Gary E. Shurtleff 
of Boeing Computer Services, Inc., who performed the computer coding 
in a timely and efficient manner. 

MULTIGRID ALGORITHM 

A brief description of the general multigrid method is presented 
for completeness and to introduce terminology. After the discussion 
of the general procedure, the mesh-embedding philosophy is discussed. 
Reference 1 should be consulted for a detailed description of the 
multigrid method. 


The objective is to solve the potential flow equation 

L[φ(r,θ,z)] = F(r,θ,z)                                                (1) 

where L is a differential operator defined as 

[Equation (2) and the accompanying definition of the speed of sound a 
are not legible in the source.]                                       (2) 

and 

F(r,θ,z) = 0. 
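
For orientation, the standard nonconservative form of the full-potential equation in cylindrical coordinates is sketched below, using the symbols of the Nomenclature (a, q, γ, φ). This is a textbook form derived from a²∇²φ = V·∇(|V|²/2) and is offered as an assumption; the source's own arrangement of equation (2) is not recoverable and may differ in grouping or normalization.

```latex
% Assumed standard form; not transcribed from the source.
\[
\begin{aligned}
L[\phi] \equiv{}& (a^2 - \phi_r^2)\,\phi_{rr}
  + \Bigl(a^2 - \frac{\phi_\theta^2}{r^2}\Bigr)\frac{\phi_{\theta\theta}}{r^2}
  + (a^2 - \phi_z^2)\,\phi_{zz} \\
 & - \frac{2\,\phi_r\phi_\theta}{r^2}\,\phi_{r\theta}
   - 2\,\phi_r\phi_z\,\phi_{rz}
   - \frac{2\,\phi_\theta\phi_z}{r^2}\,\phi_{\theta z}
   + \Bigl(a^2 + \frac{\phi_\theta^2}{r^2}\Bigr)\frac{\phi_r}{r} = 0, \\
a^2 ={}& a_\infty^2 + \frac{\gamma - 1}{2}
   \Bigl(q_\infty^2 - \phi_r^2 - \frac{\phi_\theta^2}{r^2} - \phi_z^2\Bigr).
\end{aligned}
\]
```

The cross-derivative terms with coefficient -2 are consistent with the one fragment of equation (2) that survives in the scan.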
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The (finite difference) multigrid method replaces equation 
(1) with a collection of finite difference equations 

L^k Φ^k = F^k                                                         (3) 

In equation (3), L^k represents a discretized version of the operator 
L, and Φ^k and F^k represent scalar fields defined on a grid G^k 
which is one of a hierarchy of grids of varying coarseness. 
(Φ^k is the exact solution to the discretized equation.) 

An economical approximate solution φ^k to equation (3) can 
be obtained by interpolation from a coarser grid G^{k-1}. Grid 
G^{k-1} is obtained from grid G^k by deleting every other grid line 
from grid G^k. On grid G^{k-1}, the discretized equation is 

L^{k-1} Φ^{k-1} = F^{k-1}                                             (4) 

When the approximate solution φ^{k-1} to equation (4) is obtained, 
it can be interpolated to grid G^k as follows 

φ^k = I^k_{k-1} φ^{k-1}                                               (5) 

where I^k_{k-1} is an interpolation operator from grid G^{k-1} to grid 
G^k. This procedure has been used by several authors (e.g., reference 
5) to obtain a solution to equation (3) as a sequence of solutions 
on coarser meshes (i.e., G^k, G^{k-1}, G^{k-2}, etc). The next natural 
step is to ask whether one can exploit the proximity between the 
G^k and G^{k-1} problems not only in generating a good first approximation 
on G^k, but also in the process of improving the first approximation. 
This can be done and is the crux of the multigrid philosophy. 

By taking this essential step, the errors on grid G^k can 
be smoothed inexpensively and efficiently on grid G^{k-1}. At any 
point in the solution process on grid G^k, one has the approximate 
solution φ^k to equation (3). One can formally define the error, 
Ψ^k, on grid G^k as: 

Φ^k = φ^k + Ψ^k                                                       (6) 

where Φ^k is the exact solution to equation (3). After several 
relaxation sweeps on grid G^k, the error Ψ^k is smooth. Hence, 
a good approximation Ψ^{k-1} of Ψ^k can inexpensively be computed on 
the coarser grid G^{k-1}. For this purpose, the fine grid equation 

L^k(φ^k + Ψ^k) - L^k φ^k = F^k - L^k φ^k = r^k                        (7) 

is approximated by the coarse grid equation 

L^{k-1}(I^{k-1}_k φ^k + Ψ^{k-1}) - L^{k-1}(I^{k-1}_k φ^k) = I^{k-1}_k r^k   (8) 

where the transfer operator I^{k-1}_k applied to φ^k need not be the same 
as the one applied to the residual r^k. 

By defining 

φ^{k-1} = I^{k-1}_k φ^k + Ψ^{k-1} 

equation (8) can be rewritten 

L^{k-1} φ^{k-1} = L^{k-1}(I^{k-1}_k φ^k) + I^{k-1}_k r^k              (9) 

The new unknown φ^{k-1} represents, on the coarse grid, the sum 
of the basic approximation φ^k and its correction error Ψ^k. 
When the approximate solution φ^{k-1} to equation (9) is obtained, 
it can be employed to correct the approximation on the fine grid 
as follows 

φ^k_NEW = φ^k_OLD + I^k_{k-1}(φ^{k-1} - I^{k-1}_k φ^k_OLD)            (10) 

When this procedure is extended over several grids (four 
in the McCarthy-Reyhner code) it yields accurate solutions in 
the equivalent work of only a few sweeps of the finest level. 
This is because the global errors are smoothed efficiently and 
inexpensively on the coarse mesh. 
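
The coarse-grid correction described in this section can be sketched on a two-grid, one-dimensional model problem. The example below is not the McCarthy-Reyhner code: it assumes the stand-in operator L = -d²/dx² with zero boundary values, Gauss-Seidel relaxation in place of SLOR, full weighting for the fine-to-coarse transfer and linear interpolation for the coarse-to-fine transfer. All function names are hypothetical.

```python
import math

def apply_L(u, h):
    """Model operator L u = -u'' by central differences (zero at the end points)."""
    out = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        out[i] = (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return out

def relax(u, f, h, sweeps):
    """Gauss-Seidel sweeps on L u = f; end-point values are held fixed."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def restrict(v):
    """Fine-to-coarse transfer (full weighting) onto every other point."""
    nc = (len(v) - 1) // 2
    out = [v[0]] + [0.0] * (nc - 1) + [v[-1]]
    for j in range(1, nc):
        out[j] = 0.25 * v[2 * j - 1] + 0.5 * v[2 * j] + 0.25 * v[2 * j + 1]
    return out

def prolong(vc):
    """Coarse-to-fine transfer by linear interpolation."""
    out = [0.0] * (2 * (len(vc) - 1) + 1)
    for j, value in enumerate(vc):
        out[2 * j] = value
    for j in range(len(vc) - 1):
        out[2 * j + 1] = 0.5 * (vc[j] + vc[j + 1])
    return out

def two_grid_cycle(u, f, h):
    relax(u, f, h, 2)                                    # smooth the fine-grid error
    Lu = apply_L(u, h)
    r = [0.0] + [f[i] - Lu[i] for i in range(1, len(u) - 1)] + [0.0]
    u_c = restrict(u)                                    # transferred approximation
    f_c = [a + b for a, b in zip(apply_L(u_c, 2 * h), restrict(r))]
    full = list(u_c)
    relax(full, f_c, 2 * h, 500)                         # stand-in for solving the coarse problem
    corr = prolong([a - b for a, b in zip(full, u_c)])
    for i in range(1, len(u) - 1):
        u[i] += corr[i]                                  # fine-grid correction step
    relax(u, f, h, 1)

n = 32
h = 1.0 / n
x = [i * h for i in range(n + 1)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]  # -u'' = f has u = sin(pi x)
u = [0.0] * (n + 1)
for _ in range(8):
    two_grid_cycle(u, f, h)
err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
print(f"max error after 8 cycles: {err:.2e}")
```

The coarse problem is solved for the full approximation rather than for the error alone, so the same routine would carry over to a nonlinear operator; a practical code would recurse to still coarser grids instead of relaxing 500 times on the second one.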


MESH EMBEDDING 

In regions of high gradient, a dense mesh is required to 
resolve the fine details of the flowfield. Away from the regions 
of high gradient, the dense mesh is not needed. Extending the 
dense mesh over the entire domain is actually counterproductive 
(particularly without multigrid) because of the extremely slow 
rate of convergence on a dense mesh. 

Fine flow detail can be obtained by embedding a dense mesh 
in regions of interest. This is easily implemented in a multigrid 
procedure by merely redefining the domain of the finest level 
to be a subset of the overall computational domain (cf. Figure 
1). The computations on the embedded mesh are then restricted 
to the embedded domain. The coarser grid then serves a dual purpose 
in the multigrid procedure. On the subdomain where the embedded 
mesh is not defined, the coarse grid has the role of the finest 
grid and the original difference equation (equation 3) is solved 
in that region. At the same time, on the subdomain which is coexten- 
sive with the embedded mesh, the coarse grid serves for calculating 
the coarse-grid correction, equation (10). 

Understanding of this process is facilitated by letting G^{k+1} 
denote the embedded mesh and by defining G^k_{k+1} as the set of points 
of grid G^k where the G^{k+1} difference equations are defined (i.e., 
the points of G^k which are also points of G^{k+1}, cf. Figure 2). 
The difference equations on grid G^k are accordingly modified as 
follows 

L^k φ^k = F̂^k                                                        (11) 

where 

F̂^k = F^k            in G^k - G^k_{k+1} 

and 

F̂^k = F^k_{k+1}      in G^k_{k+1}, 

where 

F^k_{k+1} = I^k_{k+1}(F^{k+1} - L^{k+1} φ^{k+1}) + L^k(I^k_{k+1} φ^{k+1}). 

In equation (11), F̂^k may be regarded as the usual G^k right- 
hand side (F^k), corrected to achieve G^{k+1} accuracy in the G^k solution. 

Figure 2 illustrates a typical embedded mesh. On the boundaries 
of the embedded domain (exclusive of solid boundaries), constant 
potential, φ^{k+1}, boundary conditions are imposed. The values 
of φ^{k+1} for the constant potential boundary conditions are obtained 
by interpolation from the next coarser grid, G^k. 
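
The piecewise right-hand side of equation (11) can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the modified McCarthy-Reyhner code: it uses the stand-in operator -d²/dx², transfers by injection, and applies the corrected right-hand side only at coarse points strictly inside the embedded patch (where the fine-grid difference equations exist). The names `composite_rhs`, `start` and the sample data are hypothetical.

```python
def L_point(u, h, i):
    """Model operator -u'' at index i (a stand-in for the discretized L)."""
    return (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)

def composite_rhs(F_coarse, phi_fine, F_fine, h_fine, start):
    """Right-hand side of the modified coarse-grid equations on a 1-D grid.

    The embedded patch begins at coarse index `start`; its even-numbered
    points coincide with coarse points. Transfers are by injection.
    """
    h_coarse = 2.0 * h_fine
    n_coarse_cells = (len(phi_fine) - 1) // 2
    F_hat = list(F_coarse)                        # plain F^k where no finer mesh exists
    injected = [phi_fine[2 * j] for j in range(n_coarse_cells + 1)]
    for j in range(1, n_coarse_cells):            # coarse points interior to the patch
        residual = F_fine[2 * j] - L_point(phi_fine, h_fine, 2 * j)
        F_hat[start + j] = L_point(injected, h_coarse, j) + residual
    return F_hat

F_coarse = [5.0] * 9          # coarse grid with 9 points
phi_fine = [0.0] * 9          # embedded patch spanning coarse indices 2..6
F_fine = [1.0] * 9
F_hat = composite_rhs(F_coarse, phi_fine, F_fine, h_fine=0.1, start=2)
print(F_hat)
```

Outside the patch the coarse grid keeps its own forcing and acts as the finest grid; inside it the forcing is replaced by the transferred fine-grid residual plus the coarse operator applied to the transferred fine-grid solution.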
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RESULTS 


The present study employs a hierarchy of five mesh densities. 
The axial and radial step sizes in the coarsest mesh, level 1, 
need not be uniform. Once the level 1 mesh, G^1, is defined, all 
finer meshes, G^k, are obtained by halving the local axial and 
radial step sizes from the coarser mesh, G^{k-1}. The step size 
in the θ-direction is held constant for all levels. Levels 1, 
2, and 3 extend over the entire computational domain while levels 
4 and 5 are embedded meshes. The analyses to date have employed 
co-extensive domains for levels 4 and 5. Table I shows the number 
of points in each of these five levels for a typical test case. 
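
The point counts in Table I follow directly from the halving rule: halving every step size turns n points into 2n - 1 points, and the circumferential count stays at 5. The short check below reproduces the table totals from the level 1 and embedded level 4 entries; the variable names are illustrative.

```python
def refine(n_points):
    """Halving every step size turns n points into 2n - 1 points."""
    return 2 * n_points - 1

n_theta = 5                      # held constant on all levels
full = [(15, 15)]                # level 1: (axial, radial) point counts
for _ in range(4):
    nz, nr = full[-1]
    full.append((refine(nz), refine(nr)))
full_totals = [nz * nr * n_theta for nz, nr in full]

embedded = {4: (41, 70)}         # embedded level 4 as listed in Table I
embedded[5] = (refine(41), refine(70))
embedded_totals = {k: nz * nr * n_theta for k, (nz, nr) in embedded.items()}
print(full_totals, embedded_totals)
```

The embedded level 5 mesh holds less than a quarter of the points a full level 5 mesh would need, which is the storage saving that motivates the embedding.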

The current analysis studied the flowfield around the NASA 
TM X-2937 1.26 contraction ratio inlet (reference 6) at angle 
of attack. The geometry of that inlet is axisymmetric and includes 
a high degree of turning in the highlight region of the lip. 

This turning provides a difficult test case for analysis. Figure 
3 illustrates the results of several analyses of the NASA inlet. 
This test case analyzed a freestream Mach number of 0.13, an angle 
of attack of 30°, and a throat Mach number of 0.48. Figure 3 
is a plot of surface Mach number versus surface arc length for 
the windward lip region. The negative arc length depicts the 
external surface while the positive arclength corresponds to the 
internal surface. A comparison of analytical results to experimental 
data is illustrated. 

Figure 3a illustrates the results of an analysis of the test 
case with the Reyhner code. The key discrepancy in the Reyhner 
code results is the underprediction of the Mach numbers in the 
supersonic bubble. It is suspected that this underprediction 
is due to either a lack of resolution in the lip region (i.e., 
the computational mesh is not adequately dense) or to viscous 
effects in the data which the potential flow analysis can not 
determine. The mesh in the Reyhner code can not be made finer 
for reasons discussed above. Thus, the present study was undertaken 
to address this question. 

Figures 3b and c illustrate the results of an analysis of 
the same test case with the modified McCarthy-Reyhner code. Figure 
3b depicts the results with an embedded level 4 mesh. The level 
4 mesh is approximately four times as dense as the mesh employed 
in the Reyhner code analysis. Examination of Figure 3b reveals 
a more nearly accurate prediction of the supersonic Mach numbers. 
Figure 3c shows the results with an embedded level 5 mesh (level 
4 is embedded as well). The overprediction of the peak Mach number 
can be explained by considering the flow in that region as a simple 
Prandtl-Meyer expansion. Apparently, either viscous interaction 
tends to mitigate this expansion, causing the lower peak Mach number 
in the data, or the high peak Mach number was not measured. This 
phenomenon is not resolved by the coarse meshes of Figures 3a 
and 3b. An excellent prediction of the remaining supersonic Mach 
numbers is obtained by the level 5 results. 

TABLE I 
TYPICAL MESH DENSITIES 

LEVEL         NUMBER OF    NUMBER OF    NUMBER OF    TOTAL 
              Z MESH       R MESH       θ MESH 

1                 15           15            5         1125 
2                 29           29            5         4205 
3                 57           57            5        16245 
4 TOTAL          113          113            5        63845 
4 EMBEDDED        41           70            5        14350 
5 TOTAL          225          225            5       253125 
5 EMBEDDED        81          139            5        56295 

Figure 3 shows that the external surface Mach numbers are 
underpredicted by all three analyses. The magnitude of the error 
is constant, thus the underprediction is apparently due to viscous 
effects (the mesh refinements have not yielded any improvement). 

Figure 4 shows the results of an analysis with level 4 extended 
over the entire domain compared to an embedded level 4 analysis. 

The case analyzed was the NASA 1.26 contraction ratio inlet at 
the same flight Mach number and angle of attack but with a throat 
Mach number of 0.64. It is apparent from this comparison that 
an analysis on level 4 does not need to be extended over the entire 
domain. Restricting attention to an embedded domain will yield 
comparable accuracy in, for this test case, one-half the CP time. 

Table II illustrates a comparison of the computational work 
and level of accuracy obtained using various mesh densities for 
a typical analysis. The measure of convergence employed in the 
present study is Δφ, the average change in potential, φ, from 
one sweep to the next. This is used because it is not convenient 
to obtain the actual residual, F^k - L^k φ^k. This number (Δφ) can 
be misleading when comparing the accuracy obtained on different 
meshes. Therefore, the maximum variation in mass flow rate from 
the enforced mass flow rate at the compressor face is calculated 
and is indicated in Table II for completeness. The mass flow 
error across the shock wave is constant for each mesh (0.6 percent). 
The work unit quoted in Table II is an amount of computational 
work equal to one relaxation sweep over a full level 4. An interest- 
ing observation from Table II can be made in regard to the embedded 
level 4 solution. That solution was obtained in one-half the 
computational work of the Reyhner code solution (i.e., the version 
of the code without multigrid). When one considers the increased 
accuracy of the embedded level 4 solution along with the decrease 
in computational effort one begins to appreciate the power of 
the multigrid code. 

TABLE II 
COMPUTATIONAL WORK AND ACCURACY COMPARISONS 

                             REYHNER    McCARTHY-REYHNER    MODIFIED CODE, 
                             CODE       CODE                EMBEDDED LEVEL >3 

Level 3 
  Number of sweeps             150          13                  13 
  Work units                    39          10                  10 
  CPU seconds (Cyber 175)      138          50                  50 
  Δm (mass flow error)         2.6%         1.6%                1.6% 

Level 4 
  Number of sweeps              -           14                  14 
  Work units                    -           33                  19 
  CPU seconds (Cyber 175)       -          180                  98 
  Δm (mass flow error)          -          1.07%               1.24% 

Level 5 
  Number of sweeps              -            -                  16 
  (the remaining level 5 entries are not legible in the source) 


CONCLUSIONS 


A method has been developed for utilizing the multigrid hierarchy 
of meshes to embed very dense meshes in regions of high gradient. The 
new method provides accurate, economical solutions to real problems of 
engineering interest. The embedded dense meshes yield the inviscid fine 
detail of the flowfield. Without multigrid, the fine detail was not 
economically obtainable. 
FIGURE 1. COMPUTATIONAL DOMAINS 

[Figures: Mach number, M, versus nondimensional arclength along surface from highlight, S/L; curves for the full level 4 solution and the embedded level 4 solution.]

FIGURE 3C. NASA TM X-2937 INLET, WINDWARD LIP (Mcf = 0.452, M∞ = 0.13, α = 30°) 

A MULTIGRID ALGORITHM FOR STEADY TRANSONIC POTENTIAL FLOWS 
AROUND AEROFOILS USING NEWTON ITERATION* 

J.W. Boerstoel 

National Aerospace Laboratory NLR 
Amsterdam, The Netherlands 


Abstract 


The application of multigrid relaxation to transonic potential-flow 
calculation was investigated. Fully conservative potential flows around 
aerofoils were taken as test problems. The solution algorithm was based on 
Newton iteration. In each Newton iteration step, multigrid relaxation was 
used to calculate correction potentials. It was found that the iteration on 
the circulation has to be kept outside the multigrid algorithm. In order to 
obtain meaningful norms of residuals (to be used in termination tests of 
loops), difference formulas with asymptotic scaling were introduced. 
Nonlinear instability problems were solved by upwind differencing using 
mass-flux-vector splitting instead of artificial viscosity or artificial 
density. It was also found that the multigrid method cannot efficiently 
update shock positions due to the (mainly) linear character of individual 
multigrid relaxation cycles. For subsonic flows, the algorithm is quite 
efficient. For transonic flows, the algorithm was found robust; its 
efficiency should be increased by improving the iteration on the shock 
positions; this is a highly nonlinear process. 


* The study was performed under contract for the Netherlands Agency for 
Aerospace Programs (NIVR), contract number 1853. 


1. Introduction 


Most computer codes for the calculation of transonic potential flows 
are based on the solution of a large finite-difference equation system by 
some nonlinear relaxation algorithm. 

The development of these algorithms started about a decade ago with work 
of Murman and Cole, who applied upwind differencing in supersonic zones to 
generate directional bias (Ref. 1). 

The most important improvements since then were the introduction of the 
concept of full discrete conservation (Murman, Ref. 2), the extension to the 
full nonlinear potential-flow equation (Jameson, Ref. 3), and the application 
of results of tensor theory to allow non-orthogonal curvilinear grids, so 
that grids can be easily aligned with complex flow boundaries (Jameson et al., 
Ref. 4; Caughey et al., Ref. 5). An impression of the state of the art may be 
obtained from references 6 and 7. 

During the last few years, numerical analysts have proposed various new 
fast-solver algorithms that perhaps may also be used to solve 
finite-difference equations for transonic potential flow more efficiently 
than nonlinear relaxation algorithms. The most interesting fast solvers are 
CR/FFT (cyclic reduction/fast Fourier transformation), AF (approximate 
factorization), ILU (incomplete lower-upper decomposition), and MGR 
(multigrid relaxation). For transonic potential-flow calculations, fast 
solvers of wide applicability are of particular interest because of the 
complexity of the potential-flow equation (nonlinear, of elliptic-hyperbolic 
type, singular at shocks and sonic lines). 
The application of multigrid methods to transonic potential-flow 
calculations was investigated by several authors (Fuchs, Refs 8,9; South 
and Brandt, Ref. 10; Jameson, Ref. 11; McCarthy et al., Ref. 12; Arlinger, 
Ref. 13). Interesting results were obtained by Fuchs for two-dimensional 
transonic small-perturbation flow around non-lifting symmetrical aerofoils. 
Some combinations of the various finite-difference equation systems and 
various versions of multigrid relaxation algorithms tested by Fuchs turned 
out to be very efficient. The other authors also reported promising results. 
Approximate factorization techniques have also been applied with success 
(Holst, Ref. 14; Baker, Ref. 15; Catherall, Ref. 16). As a function of the 
number of grid points, approximate factorization is theoretically 
asymptotically slower than multigrid relaxation, however. ILU methods have 
not yet been applied to transonic problems. The application of CR/FFT to 
transonic flow problems turned out to be not quite successful. 

The present study concerns the design of a fast-solver algorithm for 
transonic potential-flow calculations using Newton iteration and multigrid 
relaxation. 

From preliminary investigations it was known that Newton iteration (exact or 
approximate) was promising (Boerstoel, Ref. 17; Fuchs, Refs 8,9; Piers and 
Slooff, Ref. 18). The Newton iteration technique was also proposed by 
Hackbusch to solve other nonlinear problems than transonic problems 
(Ref. 19). 

Within each Newton iteration step, a linear correction problem has to be 
solved. This is done with multigrid relaxation. 

Various multigrid relaxation algorithms exist (Brandt, Refs 20,21): for 
example, a nonlinear version known as the FAS (full approximation storage) 
method, and a linear version known as the cycle C method. In this study, the 
linear version was applied because the convergence analysis is considerably 
simpler. 

Two-dimensional full potential flow around an arbitrary given aerofoil was 
used as a test problem. The flow equations were discretized in a fully 
conservative manner on an approximately orthogonal grid of O-type. The grid 
is aligned along the aerofoil. 

The nonlinear finite-difference equations are presented in section 2, the 
main structure of the solution algorithm in section 3, the multigrid process 
in section 4, and the relaxation technique applied in the multigrid process 
in section 6. Results of numerical experiments and a concluding discussion 
form the last two sections. Some stability considerations are presented in 
section 5. 


2. Finite-difference equations 

The finite-difference equations to be solved are defined on a grid of 
O-type. Such a grid may be generated by a mapping from an equidistant grid 
in a computational $(\xi,\eta)$-plane to the physical $(x,y)$-plane. The 
mapping used here consists of a sequence of a few simple transformations, 
illustrated in figure 1: a conformal Karman-Trefftz transformation, followed 
by simple correction transformations. A Karman-Trefftz aerofoil is (crudely) 
fitted to the aerofoil such that the aerofoil becomes a smooth near-circle 
under the corresponding Karman-Trefftz transformation. The trailing-edge 
corner is thereby removed. The subsequent transformations map the aerofoil 
into a circle (stretching, and shearing in radial direction), and introduce a 
stretching far from the aerofoil in a direction approximately normal to the 
streamlines, with a stretch factor $(1 - M_\infty^2)^{1/2}$. Near the trailing 
edge, the total mapping was designed to be conformal to first order; 
this permits easy implementation of various forms of the Kutta condition. 

An example of a grid is presented in figure 3. As shown, the grid is 
truncated far from the aerofoil (4 to 10 chords). 


Because, in fast-solver algorithms, the corrections to given approximate 
solutions are in general much greater than the (usually small and smooth) 
corrections in nonlinear relaxation algorithms, fast-solver algorithms are 
considerably more prone to nonlinear instability. In the initial stages of 
the study it was found that various forms of artificial-viscosity terms 
often gave rise to expansion shocks, in particular on coarser grids, and 
also at tops of supersonic zones. A nonlinear finite-difference equation 
system with excellent stability properties was obtained by introducing 
directional bias with mass-flux-vector splitting; this is a generalization 
to full potential flow of a concept applied by Engquist and Osher to 
stabilize the fully-conservative difference equations for transonic 
small-perturbation flow (Ref. 22). Numerical experiments revealed that it 
was also necessary to compute the density at cell-face centres instead of 
at cell corners. (Computation of the density at cell corners is usual 
practice in most computer codes.) 

The finite-difference equation for the mass conservation equation of 
each cell $(i,j)$ on the computational plane has the form ($^T$ means 
transposition; see Fig. 4) 



    $\nabla^T_{i,j} F^d = 0$,                                        (1)

where $\nabla$ is a second-order accurate discretization of the gradient 
operator $(\partial/\partial\xi, \partial/\partial\eta)^T$; see below for 
details. $F^d$ is a discrete mass-flux vector with three components (hence, 
the term mass-flux-vector splitting): 

    $F^d = F - F^a + F^{ar}$,                                       (2)

with $F$ the usual mass-flux vector: 

    $F = \rho h U$,                                                 (3)

    $U = G \nabla\varphi$,                                          (4)

    $\rho = \{1 + \tfrac12(\gamma-1) M_\infty^2 (1 - q^2)\}^{1/(\gamma-1)}$, 
    $q^2 = (\nabla\varphi)^T G \nabla\varphi$.                      (5,6)

$G$ is the contravariant metric tensor, and $h$ the determinant of the 
mapping $(\xi,\eta) \to (x,y)$. Velocities $q$ and densities $\rho$ have been 
scaled by their free-stream values. The mass-flux vector $F^a$ is nonzero in 
supersonic zones: 

    $F^a$ = if $M \le 1$ then 0 else 
    $\{(\rho q - \rho^* q^*)/q\}\, h\, (U U^T/q^2)\, \nabla\varphi$,   (7)

where $M$ is the local Mach number, $\rho^*$ and $q^*$ the sonic values of 
the density and the speed, and $U U^T$ is the $2 \times 2$ matrix defined by 
the exterior product of $U$ with itself. The mass-flux vector $F^{ar}$ is 
equal to $F^a$ at the centre of the upstream cell-face. ($F^a$ will be 
computed at centres of cell faces.) 
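
The splitting (2)-(7) can be illustrated along a single streamline. The following is a minimal sketch, not the discretization used in the study: it assumes $h = 1$ and no metric terms, so that $U = q$, and it takes $\gamma = 1.4$. The function names are hypothetical. It shows that $\rho q$ never exceeds the sonic value $\rho^* q^*$ and that $F - F^a$ is frozen at that value in supersonic flow.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats (assumed value for air)


def density(q, mach_inf):
    """Isentropic density scaled by its free-stream value, cf. (5)."""
    base = 1.0 + 0.5 * (GAMMA - 1.0) * mach_inf**2 * (1.0 - q**2)
    return base ** (1.0 / (GAMMA - 1.0))


def sonic_speed(mach_inf):
    """Scaled speed q* at which the local Mach number equals one."""
    return np.sqrt((2.0 + (GAMMA - 1.0) * mach_inf**2)
                   / ((GAMMA + 1.0) * mach_inf**2))


def split_flux(q, mach_inf):
    """Return (F, F_a) along the streamline, assuming h = 1 and U = q."""
    q_star = sonic_speed(mach_inf)
    sonic_flux = density(q_star, mach_inf) * q_star
    flux = density(q, mach_inf) * q
    # F_a is switched on only where the flow is supersonic (q > q*).
    flux_a = np.where(q > q_star, flux - sonic_flux, 0.0)
    return flux, flux_a
```

For example, `split_flux(np.linspace(0.5, 2.0, 31), 0.8)` returns fluxes for which `flux - flux_a` equals $\rho^* q^*$ at every supersonic sample.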

The components of the mass-flux vectors $F$ and $F^a$ are computed at the 
centres of cell faces with second-order accurate central-difference and 
central-averaging formulas applied to $\varphi$. The Mach number test 
involved in the calculation of $F^a$ is also made at the cell-face centres; 
this implies that the Mach number test for $F^{ar}$ is made at the upstream 
cell-face centre; it will be seen below that this has interesting 
consequences for the construction of sonic and shock operators. The metric 
data are assumed to be known at the cell-face centres. 

Physical and mathematical properties of the mass-flux vectors and their 
discrete divergences are readily obtained by decomposing the matrices $G$ 
and $(U U^T/q^2)$ using the orthonormal matrix 

    $V = \frac{1}{q}\begin{pmatrix} \varphi_x & -\varphi_y \\ \varphi_y & \varphi_x \end{pmatrix}$, 
    $V^T V$ = unit matrix,                                          (8)

and the relation $G = H^{-1} H^{-T}$, $H$ the Jacobian of the mapping from 
the computational space to the physical space. With (8), $(s,n)$ are the 
natural coordinates of the flow. In particular, $F$, $F^a$, and $F - F^a$ 
depend on mass-flux vectors in natural coordinates as follows if $M \ge 1$: 

    $F = h H^{-1} V\, [\rho\varphi_s,\ \rho\varphi_n]^T$, 
    $F^a = h H^{-1} V\, [\rho\varphi_s - \rho^* q^*,\ 0]^T$, 
    $F - F^a = h H^{-1} V\, [\rho^* q^*,\ \rho\varphi_n]^T$.        (9-11)

These expressions show that, in supersonic zones and in natural coordinates, 
the streamline component of $F - F^a$ has a fixed sonic magnitude; the other 
component is zero because $\varphi_n = 0$. Because the scalar 
$\rho q = \rho\varphi_s$, as a function of the speed $q = \varphi_s$, has a 
maximum at the sonic speed $q^*$, the vector $F^a$ measures the mass-flux 
excess in comparison to the sonic maximum mass flux $\rho^* q^*$. Although 
$F - F^a$ is a vector of fixed magnitude in natural coordinates, its 
divergence is generally nonzero (it is not identically zero); this 
divergence is a measure of the convergence of the streamlines. 

It may be shown that, in smooth parts of supersonic zones, the implicit 
artificial viscosity generated by the divergence of $F^a - F^{ar}$ is closely 
related to that of Jameson (Ref. 3), and to the viscosity encountered in 
the artificial-density method used by Eberle (Ref. 25), Hafez et al. 
(Ref. 26), and Holst (Ref. 14). At sonic lines and at shocks, the relation 
between the vector-split concept and the artificial-viscosity and 
artificial-density concepts is lost, however. 

At sonic lines, expansion shocks cannot occur because the mass-flux vector 
is forced to have a sonic magnitude: 

    $F^d = h H^{-1} V\, [\rho^* q^*,\ \rho\varphi_n]^T$, 
    $F^a \ne 0$, $F^{ar} = 0$,                                      (12)

at the first supersonic cell-face centre after the sonic line. Approximately 
normal shocks are allowed to become very steep, mainly because, at the first 
subsonic cell-face centre after the sonic line in the shock, the mass-flux 
vector $F^d$ has the special value 

    [equation (13) not legible]                                     (13)


It may be shown that the finite-difference equations for transonic 
small-perturbation flow proposed by Engquist and Osher have similar 
properties. They showed that their difference formulas have stable and 
unique solutions (Ref. 23). The concept of mass-flux-vector splitting 
presented here is a formal generalization to the full nonlinear flow 
equation of their splitting. 

The precise definition of the discrete gradient operators $\nabla$ in 
(1,4,7) differs from the usual ones because asymptotic scaling is applied. 
This has been done to obtain useful norms of residuals to be used in 
termination tests of iteration loops. Because of the grid stretching and 
the singular behaviour of the potential near free-stream infinity, the 
residuals of sufficiently accurate approximate solutions need not be 
uniformly small over the entire grid, but are allowed to have a certain 
growth rate when tending to free-stream infinity. Efficient residual 
norms should account for the permitted growth rate. On O-type grids, 
the permitted growth rate may be analyzed if finite-difference formulas 
with asymptotic scaling are applied. 

Asymptotic scaling naturally emerges on O-type grids, if we require 
that the velocity must be approximated uniformly to $O((h^n)^2)$ ($h^n$ the 
mesh size) for any sufficiently smooth potential $\varphi$ having the 
expected asymptotic behaviour when tending to infinity 
($\eta \downarrow 0$). This requirement leads to an analysis of the relation 
between approximation errors of difference formulas and the asymptotic 
behaviour for $\eta \downarrow 0$ of all kinds of functions of 
$(\xi,\eta)$ such as potentials, metric data, mass-flux vectors, residues, 
etc. 

The main steps of the analysis are the following. The mapping from the 
computational to the physical plane is defined such that, for 
$\eta \downarrow 0$, 

    $x = \eta^{-1} \cos 2\pi\xi + \dots$, 
    $\beta y = \eta^{-1} \sin 2\pi\xi + \dots$,                     (14)

    $\varphi = \eta^{-1} c(\xi) + \Gamma d(\xi) + \dots$,           (15)

    $d(\xi) = \dots + \text{constant}$, 
    $\beta = (1 - M_\infty^2)^{1/2}$.                               (16-17)

Using these formulas, it may be shown that the metric constants in the 
expression (6) for $q^2$, 

    $q^2 = (\nabla\varphi)^T G \nabla\varphi 
         = g^{11}\varphi_\xi^2 + 2 g^{12}\varphi_\xi\varphi_\eta + g^{22}\varphi_\eta^2$,   (18)

and the derivatives of the potential have asymptotic magnitudes given by 

    $g^{11} = O(\eta^2)$, $g^{12} = O(\eta^3)$, $g^{22} = O(\eta^4)$, 
    $\varphi_\xi = O(\eta^{-1})$, $\varphi_\eta = O(\eta^{-2})$.   (19)

It follows that $q$ may be approximated with an absolute accuracy of 
$O((h^n)^2)$, indeed, if $\varphi_\xi$ and $\varphi_\eta$ are approximated 
by difference formulas with an absolute accuracy of the order 
$\eta^{-1} O((h^n)^2)$ and $\eta^{-2} O((h^n)^2)$ respectively. 

In general, difference formulas for derivatives $f_\eta$ of functions 
$f(\xi,\eta)$ having an asymptotic power series of the form 

    $f(\xi,\eta) = c_q(\xi)\,\eta^{-q} + c_{q-1}(\xi)\,\eta^{-q+1} + \dots$   (20)

may be derived from the identity 

    $f_\eta = \eta^{-q-1} \{ (\eta^{q+1} f)_\eta - (q+1)\,(\eta^q f) \}$   (21)

by applying the usual central-difference formulas to the terms 
$(\eta^{q+1} f)_\eta$ and $(\eta^q f)$, because these terms are of unit 
order in $\eta$. The resulting difference formulas are a mixture of 
numerical and analytical differentiation in $\eta$, and have an absolute 
accuracy of order $\eta^{-q-1} (h^n)^2$. 

These general considerations were used to define the discrete gradient 
operators $\nabla$ as follows. Indicate the usual averaging and first-order 
difference operators by $\mu_\xi$, $\mu_\eta$, $\delta_\xi$, $\delta_\eta$, 
with 

    $\mu_\xi f_{(i,j)} = (f_{i+\frac12,j} + f_{i-\frac12,j})/2$, 
    $\delta_\xi f_{(i,j)} = (f_{i+\frac12,j} - f_{i-\frac12,j})/h^n$;   (22)

then (see figure 5 for the stencils) 

    $\nabla^T_{i,j} F^d = [\delta_\xi F^d_1 
        + \{\delta_\eta(\eta^2 F^d_2) - 2\mu_\eta(\eta F^d_2)\}\,\eta^{-2}]_{i,j}$, 

    $\nabla_{i+\frac12,j}\,\varphi = [\dots,\ 
        \{\dots - 2\eta\,\mu_\xi\varphi\}\,\eta^{-2}]^T_{i+\frac12,j}$ 
        (leading terms not legible),                                (23-24)

    $\nabla_{i,j+\frac12}\,\varphi = [\mu_\xi\delta_\xi\{\eta^{-1}\mu_\eta(\eta\varphi)\},\ 
        \{\delta_\eta(\eta^2\varphi) - 2\mu_\eta(\eta\varphi)\}\,\eta^{-2}]^T_{i,j+\frac12}$.   (25)


Asymptotic scaling is also applied in the retarded mass-flux vectors 
$F^{ar}$ when retardation in $\eta$-direction has to occur: 

    [equation (26) not legible]                                     (26)

The Neumann boundary condition on the aerofoil surface is zero mass flux 
through the aerofoil surface. This condition has been implemented in two 
linearly independent ways in the finite-difference equation system: 

• The mass-conservation equation (1,23) of each cell $(i,J-1)$ adjacent to 
the aerofoil image is modified by requiring that no mass enters the cell 
through the cell face $(i,J-\frac12)$ on the aerofoil image: 

    $F^d_{2\,(i,J-\frac12)} = 0$.                                   (27)

• The potential values $\varphi_{i,J}$ inside the aerofoil are coupled to 
the potential values in the flow field by applying the boundary condition 
at each cell-face centre $(i,J-\frac12)$ on the aerofoil in the form 

    $F_{2\,(i,J-\frac12)} = 0$.                                     (28)

$F^a$ and $F^{ar}$ are thus not used in this boundary condition, so that a 
second row of potential values inside the aerofoil is not needed. 

The Dirichlet boundary condition is applied on a large closed curve 
$\eta = \eta_{1/2}$ around the aerofoil (in calculations, 4 to 10 chords 
from the aerofoil): 

    $\varphi_{i,\frac12} = \bar{x}_{i,\frac12} 
        - (\Gamma/2\pi)\,\arctan(\beta\bar{y}/\bar{x})_{i,\frac12}$,   (29)

    $\bar{x} = x\cos\alpha_\infty + y\sin\alpha_\infty$, 
    $\bar{y} = -x\sin\alpha_\infty + y\cos\alpha_\infty$,           (30)

with $\alpha_\infty$ the free-stream incidence. 

Because the potential values are given on the free-stream boundary 
$j = \frac12$ instead of at the grid points $(i,1)$, one-sided second-order 
accurate difference formulas have been used at the free-stream boundary 
instead of the central formulas (24,25). 

The circulation $\Gamma$ is determined by the Kutta condition. Because near 
the trailing edge the grid is approximately conformal, the Kutta condition 
may be given the form $(\varphi_\xi)_{te} = 0$ if the flow is subsonic at 
the trailing edge. $(\varphi_\xi)_{te}$ is approximated by a 
central-difference formula. 

3. Solution algorithm 

The nonlinear finite-difference 
equation system is solved by a fast- 
solver algorithm based on the com- 
bined use of Newton iteration and 
multigrid relaxation. The main struc- 
ture of this algorithm is presented 
in this section. 

During the study it was found 
that due attention has to be paid to 
a few new problems. 

• Circulation changes give in general 
rise to an increase of norms of 
residuals. In order to prevent limit 
cycling or divergence of nested 
iteration processes (here, Newton 
iteration and multigrid relaxation), 
the increase must be allowed for in 
termination criteria of iteration 
loops . 

• The solution algorithm has to iterate on different types of 
nonlinearity: a short-wavelength nonlinearity at shocks with a length 
scale of the order of one mesh of the finest grid in a current grid 
sequence, and a mild nonlinearity elsewhere with length scales of geometric 
properties of the aerofoil such as the chord or the leading-edge radius. 
The short-wavelength nonlinearity is encountered during shock-position 
improvements. 

Both types of nonlinearity have to be processed after each circulation 
improvement, because circulation changes give rise to changes far from the 
aerofoil (almost linear, length scale the chord), at the leading edge 
(mildly nonlinear, length scale the leading-edge radius), and at shocks 
(strongly nonlinear, length scale the mesh of the finest grid). 

It was also found that multigrid 
processes do not efficiently improve 
shock positions on fine grids if the 
shocks have to move over several ( say 
five) meshes of such grids. (Such move- 
ments may be easily required, by circu- 
lation updates, for example.) This may 
be explained as follows. Multigrid 
relaxation is based on the assumption 
that a correction to an approximate 
solution may be decomposed into a sum 
of short-wavelength and long-wavelength 
components; the short-wavelength components 
are (efficiently) computed on 
the finest grid of a grid sequence, and 
the long-wavelength components are computed 
on coarser grids where fewer grid 
points are involved in the calculations. 
Such a linear decomposition of a cor- 
rection grid function in short- and 
long-wavelength components has sense in 
linear problems, and also in lineariza- 
tions of nonlinear problems. However, 
linearizations of the shock operators 
can at best estimate shock movements 
over one mesh of the finest grid. 

In fast-solver algorithms, the shock 
should be able to move over several 
meshes, however. The basic assumption 
of linearity of the multigrid relaxation 
process thus conflicts with the 
nonlinearity of the shock-movement 
process. (This is also true for the 
nonlinear FAS multigrid relaxation 
method proposed by Brandt (Ref. 21), 
because Brandt's construction of the 
FAS method makes use of the linearity 
of the correction problem on the 
finest grid.) Other details concerning 
multigrid relaxation and shock 
position updates are presented in 
figure 6. 

An iteration process in which these general considerations have been taken 
into account may be chosen to consist of an outer Newton iteration on the 
circulation $\Gamma$, and two inner iteration procedures, one for the 
calculation of corrections outside shocks (whereby multigrid relaxation is 
applied), and one for the update of shock positions. This combination of 
iteration procedures may be represented by the following algorithm. 

    initialise $\varphi$, $\Gamma_a$; 
    until $\Gamma_a$ accurate enough do 
    begin 
        until flow equations at fixed $\Gamma_a$ are solved do 
        begin 
            improve $\varphi$ at fixed $\Gamma_a$ by one multigrid cycle; 
            improve shock positions with partial relaxation sweeps; 
        end of iteration at fixed $\Gamma_a$; 
        compute error in Kutta condition; 
        improve circulation estimate $\Gamma_a$; 
    end of outer Newton iteration on $\Gamma$. 

The outer Newton iteration on the circulation is based on a split of the 
finite-difference equation system of the form 

    $L\{\varphi(\Gamma)\} = Q$,                                     (31)

    $\{\partial\varphi(\Gamma)/\partial\xi\}_{te} = 0$,             (32)

where the last equation is the Kutta condition, and the first one represents 
all other equations. The solutions $\varphi(\Gamma_a)$ of the nonlinear 
system (31) alone define a (nonlinear) relation between $\Gamma_a$ and 
$\{\partial\varphi(\Gamma_a)/\partial\xi\}_{te}$; see figure 7 for an 
illustration. The Kutta condition (32) means that we are interested in the 
value $\Gamma$ on the horizontal axis. We may iterate to this value as 
shown in the figure. The slopes needed in this Newton iteration process are 
estimated by numerical differentiation. On a fixed grid, usually three to 
four steps are sufficient to fix the lift coefficient to about three 
significant figures. 
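
The outer iteration on the circulation can be sketched as a scalar Newton iteration with a numerically differentiated slope. In the sketch below, `kutta_error` is a hypothetical stand-in for "solve (31) at fixed circulation and evaluate the left-hand side of (32)"; the step `dgamma`, the tolerance and the function name are assumptions made for illustration only.

```python
def newton_on_circulation(kutta_error, gamma0, dgamma=1e-3, tol=1e-6,
                          max_steps=10):
    """Drive kutta_error(gamma) to zero; return (gamma, steps used)."""
    gamma = gamma0
    for step in range(max_steps):
        err = kutta_error(gamma)
        if abs(err) < tol:
            return gamma, step
        # Slope of the (nonlinear) relation, estimated by a forward difference.
        slope = (kutta_error(gamma + dgamma) - err) / dgamma
        gamma -= err / slope
    return gamma, max_steps
```

For a mildly nonlinear relation such as `lambda g: 0.8 * (g - 0.5) + 0.1 * (g - 0.5) ** 3`, started from zero circulation, the iteration settles on the root within a few steps, in line with the three to four steps quoted above. Each call of `kutta_error` stands for a complete inner iteration, so the slope estimate is the expensive part.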

In each step of the outer Newton 
iteration process on $\Gamma$, the nonlinear 
equation system (31) has to be solved 
for a fixed estimate of the desired 
value of the circulation. This is done 
iteratively. In each iteration step, 
the solution is first improved outside 
shock layers by one multigrid-relaxation 
cycle (see section 4), whereby 
shock positions hardly vary. This 
multigrid-relaxation cycle is followed 
by an update of shock positions with 
nonlinear relaxation sweeps on the 
finest grid in (small) subdomains 
around shocks (partial relaxation 
sweeps). In the first partial-relaxa- 
tion sweep, the subdomains to be 
relaxed cover only shock cells . In each 
subsequent sweep, the subdomains are 
enlarged by one row of cells upstream, 
above, below and downstream of the sub- 
domain relaxed in the previous sweep. 
This enlargement is necessary, because 
partial relaxation on fixed subdomains 
may lead to divergence due to increase 
of residuals at the boundaries of the 
subdomain. 
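
The growth of the relaxed subdomain from sweep to sweep can be sketched as follows. This is an illustration only: the rectangular index box, the clipping at the grid edges and the function name are assumptions, not details given in the text.

```python
def sweep_boxes(shock_box, n_sweeps, shape):
    """Index boxes (i0, i1, j0, j1), inclusive, for successive partial sweeps.

    The first box covers the shock cells only; each later box is enlarged by
    one row of cells on every side and clipped to the grid of size `shape`.
    """
    i0, i1, j0, j1 = shock_box
    ni, nj = shape
    boxes = []
    for _ in range(n_sweeps):
        boxes.append((i0, i1, j0, j1))
        i0, i1 = max(i0 - 1, 0), min(i1 + 1, ni - 1)
        j0, j1 = max(j0 - 1, 0), min(j1 + 1, nj - 1)
    return boxes
```

For a shock occupying cells `(10, 11, 0, 3)` on a 32 by 16 grid, three sweeps relax the boxes `(10, 11, 0, 3)`, `(9, 12, 0, 4)` and `(8, 13, 0, 5)`.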

The termination of the iteration to a solution of (31), 
$L\{\varphi(\Gamma_a)\} = Q$, for a fixed estimate $\Gamma_a$ of the desired 
circulation is based on a test combining two criteria. When the circulation 
is not yet accurate enough, the iteration terminates as soon as the value of 
$\{\partial\varphi/\partial\xi\}_{te}$ is so accurate that it may be reliably 
used to improve the circulation to a better estimate. However, when the 
circulation is accurate enough, the iteration to a solution is terminated 
when a norm of the residuals of the mass-conservation equations of the 
cells has become small enough. This test strategy drives the circulation as 
fast as possible to its final value. 

A suitable norm of the residuals was found to be the maximum norm 
(see (1)) 

    $\max_{i,j} \{ (\eta_j/\eta_J)^2\, |\nabla^T_{i,j} F^d| \}\, (h^n)^{-2}$.   (33)

The scaling of the residuals $\nabla^T_{i,j} F^d$ by $(\eta_j/\eta_J)^2$ 
reflects that the residuals of sufficiently accurate approximate solutions 
are of order $\eta^{-2} O((h^n)^2)$; hence, they are allowed to grow with 
$\eta \downarrow 0$. The scaling by the mesh size $h^n$ makes the norm 
nondimensional (hence, mesh-size independent). A maximum norm is preferred 
over an rms-norm, because an rms-norm does not show that, in certain stages 
of the iteration process, large residuals may occur in very small regions. 
For example, when iterating at fixed $\Gamma_a$, after each 
multigrid-relaxation sweep, the residuals of shock cells are usually at 
least an order of magnitude larger than elsewhere in the flow due to 
velocity overshoots (or undershoots) as sketched in figure 6. Rms-norms (or 
average absolute-value norms) do not efficiently measure large residuals in 
such small regions, and cannot be safely used in termination tests of loops. 
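
The difference between the two norms is easily demonstrated. The sketch below uses made-up residual values on a 64 by 32 grid (an assumption for illustration): a background level of $10^{-4}$ and six "shock cells" with residuals of $10^{-2}$.

```python
import numpy as np


def max_norm(residuals):
    """Largest absolute residual on the grid."""
    return np.max(np.abs(residuals))


def rms_norm(residuals):
    """Root-mean-square residual on the grid."""
    return np.sqrt(np.mean(residuals ** 2))


residuals = np.full((64, 32), 1e-4)   # smooth part of the flow field
residuals[30:32, 0:3] = 1e-2          # a few cells in a shock layer
```

Here the maximum norm reports the shock-cell residual, while the rms-norm is more than an order of magnitude smaller, so a termination test based on the rms-norm would stop too early.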

In subsonic-flow calculations, 
the partial-relaxation sweeps are 
suppressed. 

The Newton iteration process on 
a grid is started with an initial 
approximation of the potential that 
is computed by solving the nonlinear 
finite-difference equation system 
(31,32) on a coarser grid (mesh size 
doubled). This is repeated on a 
sequence of three to four grids. On the 
coarsest grid of this sequence, the 
entire calculation is started with a 
uniform-flow potential having no 
circulation. 


4. Multigrid-relaxation cycles 


The multigrid relaxation cycles used in this study have a general 
structure closely resembling the cycle C algorithm of Brandt (Ref. 21). 
This general structure is presented in figure 8a. It may be seen that the 
cycle starts with a linearization of the flow equations on the finest grid 
so that the whole cycle effectively represents one (approximate) Newton 
iteration step. 

A few details of the implementation of this multigrid relaxation cycle are 
of special interest. 

• The restriction operations are applied to the complete linearized 
conservation equations instead of residuals. This is done in such a way 
that the equation systems on the coarser grids may be interpreted as 
approximations to the mass conservation equations on the finest grid. 

• Certain stability properties of the linearized flow equations are 
transferred in a controlled way to the coarser-grid equations. See 
section 5 for more details. 

As shown in figure 8a, each multigrid relaxation cycle starts with a 
linearization of the nonlinear flow-equation system (31) on the finest 
grid around a given approximate solution $\phi^m$. The result is a linear 
first-variation equation system for a first-variation potential 
$d\varphi^m$ on the finest grid: 

    $\varphi^m = \phi^m + d\varphi^m$,                              (34)

    $C^m\, d\varphi^m = dR^m$.                                      (35)

The long-wavelength part of the correction $d\varphi^m$ will be computed on 
the coarser grids of the grid sequence. This requires the definition of 
equation systems 

    $C^n\, d\varphi^n = dR^n$,  $n = m-1, m-2, \dots$,              (36)

for these long-wavelength parts on the coarser grids by a restriction 
process. In order to obtain simple restriction rules based on the 
interpretation of the linearized equation system (35) as a system of 
mass-conservation equations and boundary conditions, the grids are chosen 
staggered so that four cells of a grid coincide with one cell of the 
next-coarser grid, see figure 8b. Then the equation systems 
$C^n\, d\varphi^n = dR^n$ may be defined recursively from the one on 
the finest grid. 
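
The recursive definition on staggered grids can be sketched for cell-based quantities such as the residuals. This is a minimal illustration, not the restriction of the study: equal weights of 1/4 for the four fine cells are an assumption (the source speaks only of a weighted average), and the function names are hypothetical.

```python
import numpy as np


def restrict_cells(fine, weights=None):
    """Combine each 2x2 block of fine cells into one coarse-cell value."""
    if weights is None:
        weights = np.full((2, 2), 0.25)   # assumed equal weighting
    ni, nj = fine.shape
    # blocks[I, a, J, b] is the fine cell (2*I + a, 2*J + b).
    blocks = fine.reshape(ni // 2, 2, nj // 2, 2)
    return np.einsum('iajb,ab->ij', blocks, weights)


def restrict_to_all_grids(finest, n_grids):
    """Apply the restriction recursively; the coarsest grid comes first."""
    grids = [finest]
    for _ in range(n_grids - 1):
        grids.append(restrict_cells(grids[-1]))
    return grids[::-1]
```

Applied to a 4 by 4 array, `restrict_to_all_grids(fine, 3)` yields arrays of shape 1 by 1, 2 by 2 and 4 by 4. Because each coarse cell coincides exactly with four fine cells, no interpolation between cells is needed.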

At each cell of a grid $n$, the first-variation equation is assumed 
to be known and to have the form: 

    $\nabla^{nT}_{i,j}\, dF^{d\,n} = dR^n_{i,j}$,                   (37)

    $dF^{d\,n} = dF^n - dF^{a\,n} + dF^{ar\,n}$,                    (38)

    $dF^n = P^n\, \nabla^n d\varphi^n$,                             (39)

    $dF^{a\,n} = A^n\, \nabla^n d\varphi^n$,                        (40)

where, on the finest grid ($n = m$), the $dR^m_{i,j}$ are residuals of 
$\phi^m$ in (1): 

    $dR^m_{i,j} = -\nabla^{mT}_{i,j}\, F^{d\,m}(\phi^m)$.           (41)

$dF$, $dF^a$, and $dF^{ar}$ are the first variations of $F$, $F^a$, and 
$F^{ar}$ around $\phi^m$ so that, on the finest grid, $P^m$ and $A^m$ 
may be shown to be the $2 \times 2$ matrices 

    $P^m = [\rho h \{G - M^2 (U U^T/q^2)\}]^m$,                     (42)

    $A^m = 0$ if $M \le 1$, else 
    $[\rho h (1 - M^2)(U U^T/q^2)]^m$.                              (43)
Each equation 

    $\nabla^{(n-1)T}_{I,J}\, dF^{d\,(n-1)} = dR^{n-1}_{I,J}$        (44)

of a large cell $(I,J)$ of grid $n-1$ (see Fig. 8) is defined from the four 
corresponding equations (37) on grid $n$ by requiring that the coarse-grid 
equation should represent a mass-conservation equation for the 
long-wavelength content $d\varphi^{n-1}$ of a suitable class of correction 
grid functions $d\varphi^n$. From this requirement, restriction rules for 
the residues and the coefficient matrices are readily derived with 
mass-flux considerations. For example, the mass flux through the face 
$(I,J+\frac12)$ of the large cell $(I,J)$ in figure 8 should be equal to the 
total mass flux of the two corresponding faces $(i,j+\frac12)$ and 
$(i+1,j+\frac12)$ of the small cells, giving 

    $(P\nabla d\varphi)^{n-1}_{I,J+\frac12} = \tfrac12 [ 
        (P\nabla d\varphi)^n_{i,j+\frac12} + (P\nabla d\varphi)^n_{i+1,j+\frac12} ]$.   (45)

This should be true for the long-wavelength content in $d\varphi^n$. For the 
long-wavelength content, the three gradients in (45) are about equal: 

    $(\nabla d\varphi)^{n-1}_{I,J+\frac12} \approx (\nabla d\varphi)^n_{i,j+\frac12} 
        \approx (\nabla d\varphi)^n_{i+1,j+\frac12}$,               (46)

so that an equation for the coefficient matrix $P$ on the coarse grid is 
found: 

    $P^{n-1}_{I,J+\frac12} = \tfrac12 [ P^n_{i,j+\frac12} + P^n_{i+1,j+\frac12} ]$.   (47)

Similar arguments are used to define the other coefficient matrices at the 
cell-face centres on the grid $n-1$. The residues are readily restricted by 
applying a discrete version of Gauss' theorem. 

From (47) and similar formulas it follows that the coefficients at the 
cell-face centres are determined with coefficient-weighting. Residual 
weighting is also applied: each residual $dR^{n-1}_{I,J}$ on the coarse grid 
turns out to be a weighted average of the residuals of the four smaller 
cells on the next-finer grid that are covered by the coarse-grid cell 
$(I,J)$. 

The linearized forms of the Neumann boundary conditions (27,28) are 
restricted to coarser grids in a similar way as illustrated by (45-47). 

When the equation system $C^m\, d\varphi^m = dR^m$ for the desired 
correction $d\varphi^m$ of the potential has been restricted to the coarser 
grids, $d\varphi^m$ is estimated. This is done by a recursive process 
involving on each grid improvement of $d\varphi^n$ by line relaxation, and 
subsequent prolongation to the next-finer grid by bilinear interpolation. 
The process starts on the coarsest grid by putting the initial correction 
potential zero. When a $d\varphi^{m-1}$ has been prolonged to a 
$d\varphi^m$ on the finest grid of the grid sequence, $d\varphi^m$ is first 
added to the last potential $\phi^m$ to give a new $\phi^m$; this new 
potential is subsequently improved by nonlinear line relaxation over the 
entire finest grid. This nonlinear relaxation on the entire finest grid 
terminates the multigrid cycle. 

On each grid, one line-relaxation sweep over the grid is sufficient. On 
the coarsest grid, it is desirable to make more sweeps, however, to obtain 
a reasonable estimate of the correction potential there. Four sweeps were 
found a suitable number in applications. 

5. Stability

From numerical experiments, it was found that both the application of mass-flux-vector splitting and the calculation of gradients, velocities, and densities at cell-face centres are necessary to obtain good stability properties. As far as stability at sonic lines and shocks is concerned, much insight may be obtained from (9-13).

It is also very helpful to analyze the structure of the coefficient matrices in the first-variation equation (37) of each individual nonlinear discrete mass-conservation equation (1). A necessary condition for stability of the nonlinear finite-difference equation system is stability of each individual first-variation equation (37), because (37) is an exact linearization of (1). The last property is a consequence of the computation of $q$ and $\rho$ from $\nabla\varphi$ at cell-face centres (instead of at cell-face corners, as is usually done). The stability of each first-variation equation (37) depends on the eigenvalues of the matrices $P^m - A^m$ and $A^m$, see (42,43). It may be shown that, in subsonic flow, $P^m - A^m$ is positive definite and $A^m$ is zero while, in supersonic flow, $P^m - A^m$ and $A^m$ are both precisely semi-definite, with $A^m$ having one negative eigenvalue corresponding to the streamline direction $U/q$, and $P^m - A^m$ having one zero eigenvalue also corresponding to the streamline direction $U/q$. This may be concluded from the following factorization for supersonic flow of the mass-flux vectors and matrices (the superscript $m$ for the grid is omitted):


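$$ F - F^{a} = hH^{-1}V \begin{bmatrix} \rho^* q^*/q & 0 \\ 0 & \rho \end{bmatrix} V^{T}H^{-T}\,\nabla\varphi , $$

$$ F^{a} = hH^{-1}V \begin{bmatrix} (\rho q - \rho^* q^*)/q & 0 \\ 0 & 0 \end{bmatrix} V^{T}H^{-T}\,\nabla\varphi , $$

$$ P - A = hH^{-1}V \begin{bmatrix} 0 & 0 \\ 0 & \rho \end{bmatrix} V^{T}H^{-T} , $$

$$ A = hH^{-1}V \begin{bmatrix} \rho(1 - M^2) & 0 \\ 0 & 0 \end{bmatrix} V^{T}H^{-T} , \qquad (48\text{-}51) $$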


where V is the orthonormal matrix
(8), and H is the Jacobian of the 
mapping from the computational to the 
physical plane, so that G = 

Mass-flux-vector splitting thus leads to an exact separation of the positive and negative eigenvalues of the matrix $P$ associated with the first variation $dF = P\,\nabla d\varphi$ of the mass-flux vector $F = \rho h U$. The positive eigenvalue in the part $P - A$ suggests central differencing for $F - F^{a}$ and its first variation $dF - dF^{a}$; the negative eigenvalue in the part $A$ suggests upstream differencing for $F^{a}$ and $dF^{a}$.

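The factorizations follow directly from the relation

$$ [\,\varphi_s \;\; 0\,]^{T} = V^{T}H^{-T}\,\nabla\varphi , $$

(this follows from the chain rule for differentiation and from $\varphi_n = 0$), so that, together with (4,8), we obtain

$$ U/q = H^{-1}V \begin{bmatrix} 1 \\ 0 \end{bmatrix} , \qquad (52) $$

$$ U U^{T}/q^2 = H^{-1}V \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} V^{T}H^{-T} . \qquad (53) $$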

It will be seen from the results to be presented that approximately normal shocks are very steep. Detailed analysis of shock operators shows that this is a direct consequence of the eigenvalue of $P - A$ corresponding to the streamline direction being zero, while $F - F^{a}$ has a constant sonic streamline component; see (11) and (48,52). The last property means that the central-difference part of the mass-conservation equation (1) of cells just ahead of approximately normal shocks is independent of the large $\varphi_{ss}$ in the shock. On fine grids, the Jameson artificial viscosity also has this property, if the coefficients in this viscosity are evaluated at cell centres.

Mass-flux-vector splitting as presented here has been extensively tested numerically, and was found useful for approximately normal shocks because the matrix $U$ in the mass-flux vector is the image in the computational plane of a unit vector along the streamline in the physical plane, see (52). It may be expected that steep oblique shocks will at least require replacement of $U U^{T}/q^2$ by some matrix $\tilde U \tilde U^{T}$, where $\tilde U$ is the image in the computational plane of a unit vector approximately normal to the oblique shock.

The definiteness properties of the matrices $P^m - A^m$ and $A^m$ are used for the design of diagonally dominant tridiagonal matrices to be applied in line relaxations, see section 6 below. The definiteness properties are transferred to coarser grids by the simple restriction rules of the form (47), so that on coarser grids these properties are easily traced. This is important for the convergence of relaxations on the coarser grids.

6. Line relaxation 

In each multigrid relaxation sweep and on each grid of the grid sequence, an approximation of the solution of the first-variation equation system (36): $L^m d\varphi^m = dR^m$, or of the nonlinear equation system (31): $L^n(\varphi^n) = Q^n$, is improved by one or a few line-relaxation sweeps over the entire grid. Relaxation in the downstream direction is applied; a sweep over the lower part of the aerofoil is followed by a sweep over the upper part.

Tridiagonal equation systems to be used in line-relaxation sweeps may be derived in various ways. For example, Jameson used an analysis of a pseudo-time-dependent process to derive these relaxation equations (Ref. 23). However, the relaxation equations may also be derived directly from the first-variation equations by considering relaxation on each individual grid line as a crude Newton iteration step at that grid line.

The derivation of the relaxation equation system of each i-line thus starts with the assumption that for the potential values on the i-line a correction problem has to be solved. We may put on the entire grid:

$$ dC^m = d\varphi^m - d\tilde\varphi^m \quad\text{or}\quad dC^m = \varphi^m - \tilde\varphi^m , \qquad (54) $$

where $d\tilde\varphi^m$ or $\tilde\varphi^m$ are given estimates of potentials, and $dC^m$ is the correction to be computed from a relaxation equation system. Initially, this system has the same form as the first-variation equation (37-40), with the mass-flux vectors now depending on $dC^m$ instead of on $d\varphi^m$:



= 

V^dC^ 


(55) 


= 

v’^dc” 

3 

( 56 ) 


while the right-hand side is replaced by the residual of $d\tilde\varphi^m$ or $\tilde\varphi^m$ in (37) or in (1). This equation system is subsequently crudely approximated by a simple relaxation system for the calculation of the $dC^m_{i,j}$-values on line $i$, with a tridiagonal, diagonally dominant coefficient matrix. The approximation process consists of the following steps.

• Finite-difference formulas with asymptotic scaling are replaced by the usual difference formulas.

• Second-order cross-differences are removed by zeroing the off-diagonal elements in the matrices $P^m - A^m$ and $A^m$. The only differences that remain are those of $dC^m$ multiplied by diagonal elements of $P^m - A^m$ and $A^m$ with a known sign (see the discussion in section 5 about the definiteness properties of $P^m - A^m$ and $A^m$).

• Retarded fluxes representing inflow into a cell are zeroed. This simulates, for each cell in the supersonic zone, a zero initial condition if the calculation of the $dC^m_{i,j}$-values on line $i$ is considered to be an isolated subproblem.

• Corrections $dC^m_{i+\alpha,j+\beta}$ at points $(i+\alpha, j+\beta)$ not yet updated in the current sweep are zeroed. This simulates a Dirichlet boundary condition in the subproblem.

The resulting tridiagonal system is augmented by a formula for the improvement of the Neumann boundary condition, derived by linearizing the nonlinear condition (28), and (crudely) approximated by one-sided differences at the point $(i,j)$ in such a way that diagonal dominance is preserved.
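As an illustration of the kind of solve that each line-relaxation step requires, the following is a minimal sketch of the standard Thomas algorithm for a diagonally dominant tridiagonal system. It is not taken from the research code described here; the function name `solve_tridiagonal` and the array layout (`lower[i]` multiplies the unknown at point i-1, `upper[i]` the unknown at point i+1) are assumptions made for the example.

```python
import numpy as np

def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is done, which is safe
    for the diagonally dominant matrices assumed here.
    """
    n = len(diag)
    c = np.zeros(n)  # modified super-diagonal
    d = np.zeros(n)  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The cost is proportional to the number of points on the line, so one sweep over the grid costs a fixed number of operations per grid point.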

The tridiagonal equation system derived in this way from the linear first-variation equations turns out to be practically identical to that of Jameson (Refs 23,3,5) if the flow is subsonic or supersonic, however, without Jameson's subsonic or supersonic relaxation factors. At sonic lines and shocks, a comparison with Jameson's formulas was not possible because of lack of published results. The relaxation equations of sonic or shock cells turn out to be different from those elsewhere in the flow if they are derived from the first-variation equation.

Relaxation factors were not used in the calculation results presented below, except at sonic cells where under-relaxation was applied.

7. Results of numerical experiments

From a large number of numerical 
experiments, a number of cases have 
been selected. This selection permits 
a separate analysis of the effect of 
circulation changes, grid changes, 
shock-position variations, etc. 

All results presented were produced by calculations made on three successive grids of size 34×10, 66×18, and 130×34, which are numbered 2, 3, and 4. Each 130×34 grid is similar to that of figure 3. Multigrid sweeps for the calculations on grid 2 used two grid levels, multigrid sweeps for grid 3 used three, and multigrid sweeps for grid 4 used four grid levels.

Figure 9 concerns the flow around a symmetrical 12.8 % thick Karman-Trefftz aerofoil at $M_\infty = 0$, $\alpha = 0$ (linear problem, no circulation). Two conclusions follow from the convergence history.

• The multigrid convergence rate is good (about 0.6 per multigrid cycle).

• The residual norm increases when a solution is prolonged to a next-finer grid to serve as starting solution. This is due to a poor resolution of the coarser grids at the leading edge, as will be evident from figure 3 when 4 or 16 cells of the finest grid are grouped together to one cell of a coarser grid.

Results of the incompressible flow around the same Karman-Trefftz aerofoil, at an incidence of 10°, are presented in figure 10. It is seen that:

• changes in circulation, in particular large changes, may lead to large increases of residual norms. This is due to large changes of the solution at the leading edge.

• the grids 2 and 3 are too coarse at the leading edge to permit a reasonably accurate calculation of the expansion of the flow at the leading edge.

• the multigrid convergence rate is good (about 0.6 per multigrid cycle).

Results for a high-subsonic flow are presented in figure 11 (NACA0012, $M_\infty = 0.63$, $\alpha = 2^\circ$). The addition of subsonic nonlinearity does not lead to new conclusions.

The added complication of a shock in the calculation process is first considered for a nonlifting case with a moderate shock (NACA0012, $M_\infty = 0.8$, $\alpha = 0$). Results are presented in figure 12. New conclusions are the following.

• After each multigrid-relaxation cycle, and on all but the coarsest grid 2, the maximum norm of the residuals, (33), is usually found to be increased by an order of magnitude.

This annoying behaviour of the maximum norm is due to velocity overshoots or undershoots at shocks, as discussed in section 3, and illustrated in figure 6. A typical example of a velocity overshoot is presented in figure 12d. (This figure is the result of a somewhat different algorithm not presented in this report; the velocity overshoot effect is representative, however.) As discussed, the velocity overshoot is due to a tendency of multigrid-relaxation cycles to keep the shock position fixed. See also Jameson's remark in reference 11, page 125 about "the appearance ahead of the shock of a temporary overshoot", and the corresponding flat segment in the convergence history in his figure 2b. This implies that there must be large residuals in small zones keeping his average-absolute-value norm temporarily about constant.

• The velocity overshoots can be reliably transformed with partial relaxation sweeps on the current finest grid to appropriate shock displacements: partial relaxation usually reduces the residual norm considerably.

• The lack of resolution at the leading edge on the coarser grids leads to too little flow expansion over the leading edge and to shock positions that are too far forward. It may be expected that improvement of the resolution at the leading edge on grids 2 and 3 will lead to better pressure distributions so that a smaller calculation effort to improve shock positions is required (see Holst and Brown, Ref. 24).

A transonic case with lift is presented in figure 13 (NACA0012, $M_\infty = 0.75$, $\alpha = 2^\circ$). This case has also been computed in references 7 and 11. There are no important new points to be observed. The peak Mach number ahead of the shock is 1.37. Hence, for practical purposes, the shock is fairly strong. The shock obviously covers two cells; this is always true.


Central processor times of the research code used for the numerical experiments are presented in table 1. These times were measured on the NLR Cyber 73-28 computer. The table illustrates that the algorithm is efficient for subsonic flows. For transonic flows, the computation times are too long. This is primarily due to the partial relaxation sweeps used to update shock positions. A continued search for improved shock-position update algorithms will be required.

Table 1

(M∞, α∞)    GRID   NBR OF        NBR OF PART.     CP-S        TOTAL
                   MGR CYCLES    RELAX. SWEEPS    PER GRID    CP-S

(.63,0)      2        14              0              31
             3        10              0              74
             4         8              0             219        324
(.8,0)       2         8             14              23
             3         5             30              64
             4        10            122             680        767
(.63,2°)     2        21             29              58
             3        23             68             231
             4        25            278            1544       1833

Results of computations with various forms of artificial-viscosity terms instead of split mass-flux vectors are omitted. We found all artificial-viscosity terms tested to have poor stability properties at sonic lines and/or shocks, in particular on coarse grids, and also at the tops of supersonic zones when corrections were large. This was due to the fact that the viscosity terms did not deliberately exclude expansion shocks.

8. Conclusions 

From the results presented in this study it is evident that the introduction of multigrid methods in transonic potential-flow calculations is not a simple matter. A number of conclusions are clear from the present study, however.

• Circulation changes usually give rise to an increase of residual norms (section 7).
bility at shocks and sonic lines. This requires difference formulas with excellent stability properties. Such

• It will be hard to improve shock-wave positions by multigrid relaxation pro-

a linear or weakly nonlinear (FAS) correction process, while shock-wave displacement processes are highly non-
INTRODUCTION


Two computational approaches which achieved substantial popularity 
during the past decade are spectral methods and multi-grid techniques. The 
former have proven highly efficient for time-dependent, smooth flows in
simple geometries (refs. 1-3). The latter have been remarkably successful 
for elliptic equations and some steady-state calculations (refs. 4-7). The 
principal advantage of spectral methods lies in their ability to achieve 
accurate results with substantially fewer grid points than required by 
typical finite difference methods. Despite the fact that spectral methods 
are represented by full matrices, explicit time-stepping algorithms can be 
implemented nearly as efficiently for them as for finite difference methods 
on a comparable grid. Transform methods (ref. 8) are often the key to this 
efficiency. For implicit methods or for steady-state equations, direct 
solution of the spectral equations is not practical in general. Iterative 
schemes for such equations are essential. Orszag (ref. 9) has described 
several attractive methods. 

This paper examines an alternative approach which employs multi-grid 
concepts in the iterative solution of spectral equations. In particular, 
spectral multi-grid methods are described for self-adjoint elliptic equations 
with either periodic or Dirichlet boundary conditions. For realistic fluid 
calculations the relevant boundary conditions are likely to be periodic in at 
least one (angular) coordinate and Dirichlet (or Neumann) in the remaining 
coordinates. Spectral methods may not always be effective for flows in 
strictly rectangular geometries since corners generally introduce
singularities into the solution. These singularities can seriously degrade 
the accuracy of a spectral method. If the boundary is smooth, then mapping 
techniques (ref. 9) can often be used to transform the problem into one with 
a combination of periodic and Dirichlet boundary conditions. Spectral multi- 
grid methods in these geometries can be devised by combining the techniques 
presented separately here. 


Research of TAZ was supported by NASA Grant NAG1-109; research of YSW and MYH was supported by NASA Contract Nos. NAS1-15810 and NAS1-16394, respectively, while they were in residence at ICASE, NASA Langley Research Center.
SYMBOLS

A               diagonal matrix of PDE coefficients at collocation points
a               variable coefficient in PDE
B               diagonal matrix of PDE coefficients at collocation points
b               variable coefficient in PDE
C               matrix representing Fourier transform
c               constants used in describing the discrete cosine transform
D               matrix representing first derivative operator in transform space
E               matrix describing trigonometric interpolation in transform space
F               right-hand-side terms of PDE at collocation points
f               right-hand-side term of PDE
$G_k$           grid on level k
H               pre-conditioning matrix
K               finest level of the multiple grids
k               any grid (or level) k of multiple grids
L               matrix representing spectral approximation to PDE operator
$\mathcal{L}$   lower-triangular matrix
M               matrix representing first derivative operator in physical space
$m$             vector used for describing M
N               number of collocation points (in one coordinate direction)
n               number of distinct relaxation parameters
P               matrix representing coarse-to-fine grid interpolation
R               matrix representing fine-to-coarse grid interpolation
S               matrix representing finite difference approximation to PDE
$T_n$           Chebyshev polynomial of degree n
U               vector of solution at collocation points
$\mathcal{U}$   upper-triangular matrix
u               solution to PDE
$\hat{u}$       Fourier transform of solution to PDE
V               vector of corrections in multi-grid scheme
$v$             approximate solution to V
x, y            physical space coordinates
$\delta_{j,l}$  Kronecker delta function
$\epsilon$      amplitude in variable coefficient term
$\lambda$       eigenvalue
$\xi$           eigenvector
$\rho$          spectral radius
$\omega$        relaxation parameter
$\mu$           smoothing rate
$\bar{\mu}$     average smoothing rate


PERIODIC PROBLEMS 


Fourier Spectral Approximations 

Several types of spectral approximations can be employed. The specific 
method used here is often termed collocation or pseudo-spectral 
approximation. In many cases this method is easier to implement and is more 
efficient than the alternative Galerkin and tau approximations. A thorough 
discussion of all these methods can be found in reference 3.

For a periodic problem spectral approximations should be based upon Fourier series. In the collocation approach the fundamental representation of the solution remains in physical space. The Fourier coefficients are only employed as an intermediate result in the approximate evaluation of derivatives. Consider a function $u(x)$ which is periodic over the interval $[0, 2\pi]$. Use $N$ evenly spaced collocation points

$$ x_j = 2\pi j/N , \qquad j = 0, 1, \ldots, N-1 \qquad (1) $$

and denote $u(x_j)$ by $u_j$. The first step in the evaluation of $du/dx$ is the computation of the approximate Fourier coefficients via

$$ \hat{u}_p = (1/\sqrt{N}) \sum_{j=0}^{N-1} u_j\, e^{-i p x_j} , \qquad p = -N/2, -N/2+1, \ldots, N/2-1 . \qquad (2) $$

Since the $u(x_j)$ are real, $\hat{u}_{-N/2}$ is real and $\hat{u}_{-p} = \hat{u}_p^*$ for $|p| < N/2$, where the $*$ denotes complex conjugation. The derivative is then computed via

$$ du/dx(x_j) = (1/\sqrt{N}) \sum_{p=-N/2+1}^{N/2-1} i p\, \hat{u}_p\, e^{i p x_j} , \qquad j = 0, 1, \ldots, N-1 . \qquad (3) $$

Both sums can be evaluated in $O(N \ln N)$ operations by the Fast Fourier Transform (ref. 10). This algorithm is most commonly employed with $N$ chosen as a power of 2.

Note that the lower limit on the sum in equation (3) is not $p = -N/2$ but $p = -N/2 + 1$. This change is equivalent to setting $\hat{u}_{-N/2} = 0$. The right-hand side of equation (3) is necessarily real. The neglected term

$$ -i\,(N/2)\, \hat{u}_{-N/2}\, e^{-i (N/2) x_j} $$

is purely imaginary and cannot contribute to $du/dx(x_j)$. This neglected term represents the familiar "two-point oscillation" in $u(x)$. (Finite difference schemes which use central differences for first derivatives also remove the two-point oscillation.)
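The derivative evaluation of equations (1)-(3) can be sketched with NumPy's FFT routines. This is only an illustration: the function name `fourier_derivative` is hypothetical, NumPy's unnormalized transform pair is used in place of the symmetric $1/\sqrt{N}$ scaling (the two factors cancel in the round trip), and an even $N$ is assumed.

```python
import numpy as np

def fourier_derivative(u):
    """Approximate du/dx at the collocation points x_j = 2*pi*j/N (N even).

    Transform to Fourier space, multiply each coefficient by i*p, and
    transform back. The p = -N/2 coefficient is set to zero, which removes
    the two-point oscillation as described in the text.
    """
    N = u.size
    u_hat = np.fft.fft(u)
    # Integer wavenumbers in FFT order: 0, 1, ..., N/2-1, -N/2, ..., -1
    p = np.fft.fftfreq(N, d=1.0 / N)
    p[N // 2] = 0.0  # drop the p = -N/2 mode
    return np.fft.ifft(1j * p * u_hat).real
```

For a band-limited input such as `np.sin(3 * x)` on 16 points the result agrees with `3 * np.cos(3 * x)` to rounding error, and the sawtooth `(-1)**j` is mapped to zero.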

The spectral evaluation of derivatives has a convenient matrix representation. Let $U$ denote the vector of the solution at the grid, or collocation, points, i.e.,

$$ U = (u_0, u_1, \ldots, u_{N-1}) , \qquad (4) $$

let $C$ represent the discrete Fourier transform, i.e.,

$$ C_{jl} = (1/\sqrt{N})\, e^{-2\pi i (j - N/2)\, l/N} , \qquad j, l = 0, 1, \ldots, N-1 , \qquad (5) $$

and let $D$ be the diagonal matrix which represents the first derivative in Fourier space, i.e.,

$$ D_{jl} = \begin{cases} i\,(j - N/2)\,\delta_{j,l} & \text{for } j = 1, 2, \ldots, N-1 \\ 0 & \text{for } j = 0 . \end{cases} \qquad (6) $$

Note that $C^{-1} = C^{H}$, the Hermitian transpose of $C$. Then the matrix

$$ M = C^{-1} D\, C \qquad (7) $$

represents (in physical space) the spectral evaluation of a first derivative. This matrix is given explicitly by

$$ M_{jl} = m_{j-l} , \qquad (8) $$

where

$$ m_j = \begin{cases} 0 & j = 0, \pm N, \pm 2N, \ldots \\ \cos\big((1 - 1/N)\pi j\big) \big/ \big(2 \sin(\pi j/N)\big) & \text{otherwise.} \end{cases} \qquad (9) $$
A spectral approximation to the ordinary differential equation

$$ (d/dx)\{ a(x)\, du/dx \} = f(x) \qquad (10) $$

on $[0, 2\pi]$ with periodic boundary conditions, and with $a(x)$ and $f(x)$ infinitely differentiable as well as periodic, satisfies the discrete equation

$$ L\, U = F , \qquad (11) $$

where

$$ L = M A M , \qquad (12) $$

$$ A_{jl} = a(x_j)\, \delta_{j,l} \qquad (13) $$

and

$$ F = (f_0, f_1, \ldots, f_{N-1}) . \qquad (14) $$

Equation (11) may be inverted to yield

$$ U = (C^{-1} D^{-1} C\, A^{-1}\, C^{-1} D^{-1} C)\, F . \qquad (15) $$

Although the matrix $D$ is technically singular, this merely reflects the usual non-uniqueness of the solution of equation (10). All of the matrix multiplies required by the right-hand side of equation (15) may be implemented efficiently. There are three diagonal matrices and four Fourier transforms. Thus, the solution to equation (11) can be obtained directly in $O(N \ln N)$ operations, even though the matrix $L$ is full.
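A minimal sketch of the direct one-dimensional solve of equation (15) follows. The function names are hypothetical, and the treatment of the singular modes is an assumption added for the example: the zero-wavenumber and $p = -N/2$ coefficients are set to zero when $D$ is inverted, and a constant `c` is added after the first integration so that the computed $du/dx$ has zero mean and the returned solution is periodic with zero mean.

```python
import numpy as np

def _wavenumbers(N):
    # Integer wavenumbers in FFT order, with the p = -N/2 mode zeroed
    p = np.fft.fftfreq(N, d=1.0 / N)
    p[N // 2] = 0.0
    return p

def spectral_derivative(u):
    """Apply M = C^{-1} D C."""
    p = _wavenumbers(u.size)
    return np.fft.ifft(1j * p * np.fft.fft(u)).real

def spectral_antiderivative(g):
    """Apply a pseudo-inverse of M: divide by i*p, zeroing the singular modes."""
    p = _wavenumbers(g.size)
    g_hat = np.fft.fft(g)
    out = np.zeros_like(g_hat)
    nonzero = p != 0
    out[nonzero] = g_hat[nonzero] / (1j * p[nonzero])
    return np.fft.ifft(out).real

def solve_periodic_ode(a, f):
    """Solve the discrete form of (a u')' = f for the zero-mean solution."""
    w = spectral_antiderivative(f)            # a*u' up to an additive constant
    inv_a = 1.0 / a
    c = -np.mean(w * inv_a) / np.mean(inv_a)  # assumed fix: forces mean(u') = 0
    v = (w + c) * inv_a                       # u' at the collocation points
    return spectral_antiderivative(v)
```

The work consists of four FFTs and a few pointwise operations, in line with the $O(N \ln N)$ count quoted in the text.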


Unfortunately, efficient direct solutions are not available in higher dimensions. Consider the self-adjoint elliptic equation

$$ \frac{\partial}{\partial x}\Big\{ a(x,y) \frac{\partial u}{\partial x} \Big\} + \frac{\partial}{\partial y}\Big\{ b(x,y) \frac{\partial u}{\partial y} \Big\} = f \qquad (16) $$

on the square $[0, 2\pi] \times [0, 2\pi]$. Again impose periodic boundary conditions and assume that the functions $a$, $b$ and $f$ are also periodic as well as infinitely differentiable. A spectral approximation to equation (16) will exhibit exponential convergence, i.e., the error will ultimately decrease faster than any finite inverse power of the number of collocation points.

For simplicity, suppose that an $N \times N$ mesh is employed. Define the approximate solution

$$ U_{jl} = u(x_j, y_l) \quad \text{for } j, l = 0, 1, \ldots, N-1 . \qquad (17) $$

Define $F$ in a similar fashion and let $A$ and $B$ be the diagonal matrices representing $a(x,y)$ and $b(x,y)$, respectively, in the manner of equation (13). The discrete approximation to equation (16) is

$$ L\, U = F , \qquad (18) $$

where the fourth-order tensor

$$ L = (M \otimes I)\, A\, (M \otimes I) + (I \otimes M)\, B\, (I \otimes M) , \qquad (19) $$

with $\otimes$ denoting a tensor product and $I$ representing the identity matrix of order $N$.

The authors are unaware of any efficient method for solving equation (18) directly. The iterative methods described in reference 9 are one possible solution scheme. A different sort of iterative method, one involving the use of multiple grids, is described below.


Euler Iteration on a Single Grid 

The direct solution of the $N^2 \times N^2$ system represented by equation (18) would require $O(N^4)$ storage locations and $O(N^6)$ operations. Many iterative schemes require only $O(N^2)$ storage locations and $O(N^2 \ln N)$ operations per step. Perhaps the simplest iterative scheme is the Euler method

$$ U \leftarrow U - \omega\, (F - L\, U) , \qquad (20) $$

where $\omega$ is a relaxation parameter. Aside from the coefficients $a(x_j, y_l)$ and $b(x_j, y_l)$, the only substantial storage required is for the residual [the term in parentheses in eq. (20)], which is clearly $O(N^2)$. The tensor $L$ is never explicitly required. The residual itself costs $O(N^2 \ln N)$ operations to compute. Jacobi's method (see below) is also economical in storage and cost per step. Not all iterative schemes used to solve finite difference equations are practical for the spectral equations, however. Gauss-Seidel is an obvious example. The term $L\, U$ can only be evaluated efficiently if it is done all at once.
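The iteration (20) can be sketched for the constant coefficient case $a = b = 1$, where $L\,U$ is evaluated all at once with two-dimensional FFTs. The function names are hypothetical, the starting guess of zero is an assumption, and the relaxation parameter is passed in by the caller, who must choose it small enough for the iteration to converge.

```python
import numpy as np

def spectral_laplacian(U):
    """Evaluate L U for a = b = 1 on an N x N periodic grid (N even)."""
    N = U.shape[0]
    p = np.fft.fftfreq(N, d=1.0 / N)  # integer wavenumbers in FFT order
    p[N // 2] = 0.0                   # the -N/2 mode is ignored
    P, Q = np.meshgrid(p, p, indexing="ij")
    return np.fft.ifft2(-(P**2 + Q**2) * np.fft.fft2(U)).real

def euler_iteration(F, omega, n_iter):
    """Apply U <- U - omega*(F - L U) n_iter times, starting from U = 0."""
    U = np.zeros_like(F)
    for _ in range(n_iter):
        residual = F - spectral_laplacian(U)
        U = U - omega * residual
    return U
```

Only the current iterate and the residual are stored, and each step costs one forward and one inverse two-dimensional FFT.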

It is instructive to consider the application of the Euler iteration to the constant coefficient case $a(x,y) = b(x,y) = 1$. The tensor $L$ simplifies to

$$ L = M^2 \otimes I + I \otimes M^2 . \qquad (21) $$

The eigenvalues and eigenvectors of $L$ are

$$ \lambda_{pq} = -(p^2 + q^2) \qquad (22) $$

$$ \xi_{jl}(p,q) = e^{2\pi i (pj + ql)/N} , \qquad (23) $$

where the eigenvalues and eigenvectors are labelled by $p$ and $q$ which lie in the range $p, q = -N/2, -N/2+1, \ldots, N/2-1$. In equation (22), if either $p$ or $q = -N/2$, then that term should be replaced by 0 on the right-hand side. A single iteration by equation (20) replaces the error component $\xi(p,q)$ with $(1 + \omega \lambda_{pq})\, \xi(p,q)$. There are two eigenvectors which are unaffected by the iteration. One of these, for $p = q = 0$, represents the mean level of the solution. It must be specified for the partial differential equation to have a unique answer. The other term, for $p = q = -N/2$, represents the high-frequency component that is ignored by the discretization. This component should be filtered out of the right-hand side $F$.

This scheme is convergent if

$$ \omega < -2/\lambda_{N/2-1,\,N/2-1} = 4/(N-2)^2 . \qquad (24) $$

The smallest spectral radius

$$ \rho = (N^2 - 4N + 2)/(N^2 - 4N + 6) \simeq 1 - 4/N^2 , \qquad (25) $$

is obtained when

$$ \omega = 4/(N^2 - 4N + 6) . \qquad (26) $$

According to the usual reasoning, equation (25) implies $O(N^2)$ iterations are required. This means a total of $O(N^4 \ln N)$ operations are required in order to solve equation (18) in this fashion.
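Equations (25) and (26) follow from balancing the damping of the smallest and largest nonzero eigenvalue magnitudes, $1$ and $2(N/2-1)^2$. The short check below illustrates this; the function name is hypothetical.

```python
def single_grid_optimum(N):
    """Return (omega, rho) for single-grid Euler iteration on the 2-D Poisson problem.

    The optimal omega makes |1 - omega*lam| equal at the smallest and largest
    eigenvalue magnitudes, which gives omega = 2/(lam_min + lam_max) and
    rho = (lam_max - lam_min)/(lam_max + lam_min).
    """
    lam_min = 1.0                       # |lambda| for (p, q) = (1, 0)
    lam_max = 2.0 * (N / 2 - 1) ** 2    # |lambda| for p = q = N/2 - 1
    omega = 2.0 / (lam_min + lam_max)
    rho = (lam_max - lam_min) / (lam_max + lam_min)
    return omega, rho
```

For example, `single_grid_optimum(8)` gives $\omega = 4/38$ and $\rho = 34/38 \approx 0.8947$, in agreement with equations (25) and (26).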

Euler Iteration Using Multiple Grids 

Multi-grid methods have become a standard means of accelerating 
convergence for finite difference and finite element discretizations of 
elliptic equations. The basic processes are the relaxation scheme and the 
transfer of residuals and corrections between the various grids. In addition 
to specific choices of relaxation and interpolation procedures, a multi-grid 
algorithm must give rules governing the transfer between grids. A variety of 
control structures for this latter process have been employed. For examples 
of some of the control structures, see the flow charts in reference 5. The 
present discussion will focus on the relaxation and interpolation procedures, 
since they are less arbitrary than the control structure. Moreover, the 
description will be given for the spectral discretization of the one- 
dimensional problem [eqs. (11) and (12)]. This is done simply for notational 
convenience. The performance will be assessed, and numerical examples given, 
however, for the two-dimensional case. 

Define a series of grids (or levels) $G_k$, for $k = 2, 3, \ldots, K$, covering the interval $[0, 2\pi]$. Let $G_k$ consist of $N_k$ uniformly spaced points, where $N_k = 2^k$. The solution to equation (11) is obtained by combining Euler iterations on level $K$ with Euler iterations for related problems on the coarser levels $k < K$. Denote the relevant discrete problem at any level $k$ by

$$ L_k V_k = F_k . \qquad (27) $$

On the finest level $K$, $L_K = L$, $F_K = F$ and the solution $V_K = U$, the solution to equation (11). At any stage in the iterative solution process for equation (27), only an approximation $v_k$ to the exact answer $V_k$ is available. If this approximation is deemed adequate, then the approximation on the next-finer level $k+1$ is corrected via

$$ v_{k+1} \leftarrow v_{k+1} + P_{k+1} v_k . \qquad (28) $$

The matrix $P_k$ represents the coarse-to-fine transfer of corrections from level $k-1$ to level $k$. On the other hand, if the approximation $v_k$ is deemed inadequate, either another relaxation is performed, via

$$ v_k \leftarrow v_k - \omega_k (F_k - L_k v_k) , \qquad (29) $$

or else control shifts to a problem on the next-coarser level $k-1$. The relaxation parameter $\omega_k$ on level $k$ is chosen to damp preferentially those error components which are not represented on coarser grids. The right-hand side of the coarser grid problem is obtained from

$$ F_{k-1} = R_k (F_k - L_k v_k) . \qquad (30) $$

The matrix $R_k$ represents the fine-to-coarse residual transfer from level $k$ to level $k-1$.

For the spectral multi-grid method the natural interpolation operators represent trigonometric rather than polynomial interpolation. For the one-dimensional case,

$$ R_k = C_{k-1}^{-1} E_{k-1} C_k , \qquad (31) $$

$$ P_k = C_k^{-1} E_{k-1}^{T} C_{k-1} , \qquad (32) $$

where the $N_k \times N_{k+1}$ matrix

$$ E_k = (\, 0 \mid I_k \mid 0 \,) \qquad (33) $$

(with $I_k$ the identity matrix of order $N_k$), $E_k^{T}$ is its transpose, and $C_k$ is the matrix given in equation (5) for $N = N_k$. The matrix $E$ represents the dropping of the high-frequency Fourier coefficients in the trigonometric interpolation from the fine grid to the coarse grid. Note that $P_k = R_k^{T}$. The generalization to higher dimensions is straightforward.
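The trigonometric transfer operators can be sketched with FFTs instead of explicit matrices. The function names `restrict` and `prolong` are hypothetical, and one detail is an assumption of this sketch rather than something stated in the text: the coarse-grid $-N/2$ coefficient is set to zero in both directions, so that the transferred vectors stay real.

```python
import numpy as np

def restrict(u_fine):
    """Fine-to-coarse transfer: keep Fourier modes |p| < N_coarse/2, drop the rest."""
    N = u_fine.size
    M = N // 2                             # coarse grid size
    h = M // 2
    fine_hat = np.fft.fft(u_fine) / N      # normalized Fourier coefficients
    coarse_hat = np.zeros(M, dtype=complex)
    coarse_hat[:h] = fine_hat[:h]                  # modes 0 .. h-1
    coarse_hat[h + 1:] = fine_hat[N - h + 1:]      # modes -(h-1) .. -1
    return np.fft.ifft(coarse_hat * M).real

def prolong(u_coarse):
    """Coarse-to-fine transfer: pad the coarse spectrum with zeros."""
    M = u_coarse.size
    N = 2 * M
    h = M // 2
    coarse_hat = np.fft.fft(u_coarse) / M
    fine_hat = np.zeros(N, dtype=complex)
    fine_hat[:h] = coarse_hat[:h]
    fine_hat[N - h + 1:] = coarse_hat[h + 1:]
    return np.fft.ifft(fine_hat * N).real
```

For a function that contains only modes representable on the coarse grid, `restrict` returns its samples at the coarse points exactly and `prolong` recovers the fine-grid samples exactly, which is the sense in which these transfers generate no spurious high-frequency components.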

For the constant coefficient, one-dimensional case, the finest grid relaxation operator

$$ L_K = M_K^2 , \qquad (34) $$

and it is natural to use

$$ L_k = M_k^2 \qquad (35) $$

for $k < K$. It is easy to show that

$$ L_{k-1} = R_k L_k P_k . \qquad (36) $$

The description of the variable coefficient relaxation operator is more complicated and the details will be published elsewhere. The procedure used in the numerical experiments reported below amounts to performing the collocation operations in an alias-free fashion.

For the two-dimensional Poisson equation discussed in the previous section the level $k$ relaxation parameter $\omega_k$ is chosen to maximize the smoothing of all the modes except those for which $|p|, |q| < N_k/4$:

$$ \omega_k = 2 \big/ \big( (9/16) N_k^2 - 2 N_k + 2 \big) . \qquad (37) $$

This choice produces a smoothing rate for the high-frequency modes of

$$ \mu_k = 1 - 2 N_k^2 / (9 N_k^2 - 32 N_k + 32) . \qquad (38) $$
This smoothing rate is listed in Table 1 alongside the spectral radius for the single grid Euler method. The advantage of multiple-gridding is apparent. For large $N_k$, $\mu_k \simeq 7/9$. Thus, according to the usual multi-grid argument, the number of iterations needed to obtain a given reduction in the residual should be independent of the number of grid points on the finest grid. This assumes, of course, no untoward effects of the interpolation process. But the trigonometric interpolation procedure used here is ideally suited to minimize the spurious generation of high-frequency components at these stages.
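The multi-grid parameter (37) and smoothing rate (38) come from the same balancing argument as on a single grid, applied only to the high-frequency eigenvalue magnitudes, which run from $(N_k/4)^2$ to $2(N_k/2-1)^2$. A short check follows; the function name is hypothetical.

```python
def multigrid_smoothing(N):
    """Return (omega, mu) for Euler relaxation on a level with N points per direction.

    Only modes with |p| >= N/4 or |q| >= N/4 are targeted, so the relevant
    eigenvalue magnitudes lie between (N/4)**2 and 2*(N/2 - 1)**2.
    """
    lam_lo = (N / 4) ** 2
    lam_hi = 2.0 * (N / 2 - 1) ** 2
    omega = 2.0 / (lam_lo + lam_hi)
    mu = (lam_hi - lam_lo) / (lam_hi + lam_lo)
    return omega, mu
```

For $N = 8$ this gives $\mu = 14/22 \approx 0.6364$, and as $N$ grows the rate tends to $7/9 \approx 0.7778$, independent of the grid size.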


TABLE 1. Convergence Rates for Euler Iteration in Two Dimensions

  N      single grid         multi-grid
         spectral radius     smoothing rate

  4         0.3333              0.3333
  8         0.8947              0.6364
 16         0.9798              0.7193
 32         0.9956              0.7510
 64         0.9990              0.7649
  ∞         1.0000              0.7778


Alternatives to Euler Relaxation 


A straightforward improvement upon the simple relaxation scheme described in the preceding sub-section is to make it non-stationary. This approach has been used for accelerating point-Jacobi iterations for finite difference multi-grid algorithms (see ref. 11). The non-stationary Euler iteration consists of using $n$ relaxation parameters $\omega_1, \omega_2, \ldots, \omega_n$ in a cyclic fashion on each level $k$. These parameters are determined from the solution of a standard minimax problem over the interval covered by the high-frequency eigenvalues.

For the two-dimensional Poisson equation, this eigenvalue range is from $-(N_k/4)^2$ to $-2(N_k/2-1)^2$. The results are only changed slightly if the upper limit of this range is changed to $-2(N_k/2)^2$. Then the optimal parameters are given by

$$ \omega_j = (32/N_k^2) \big/ \big( 7 \cos((j - 1/2)\pi/n) + 9 \big) \qquad (39) $$

and the total smoothing of the high frequencies after the full $n$ relaxations is $1/|T_n(-9/7)|$, where $T_n(x)$ is the Chebyshev polynomial of degree $n$. Then the effective smoothing rate

$$ \bar{\mu}_k = \big( 1/|T_n(-9/7)| \big)^{1/n} , \qquad (40) $$
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which is the average smoothing per single step in the cyclic relaxation. The 
values are given in Table 2 along with the corresponding effective smoothing 
rates for a finite difference multi-grid method which also is relaxed with 
Euler iteration. The spectral smoothing rates are larger than the finite 
difference ones because the ratio of the largest high-frequency eigenvalue to 
the smallest high-frequency eigenvalue is 8 in the former case and only 4 
in the latter. This ratio may be termed the multi-grid condition number. 
The higher smoothing rate for the spectral method suggests that a larger 
number of distinct relaxation parameters should be used here than for the 
finite difference case. 
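The effective smoothing rates can be reproduced from the multi-grid condition number alone. The Python sketch below assumes the standard Chebyshev minimax result: for eigenvalues in an interval whose end points have ratio r, the per-step rate is $(1/T_n(x_0))^{1/n}$ with $x_0 = (r+1)/(r-1)$, which gives $x_0 = 9/7$ for r = 8. It also checks numerically that the parameters of equation (39) attain this bound on the interval from $N_k^2/16$ to $N_k^2/2$. All function names are illustrative.

```python
import math


def chebyshev_parameters(n_k, n):
    """Relaxation parameters of eq. (39) for an n-step cycle on level size n_k."""
    return [(32.0 / n_k ** 2) / (7.0 * math.cos((j - 0.5) * math.pi / n) + 9.0)
            for j in range(1, n + 1)]


def effective_smoothing_rate(ratio, n):
    """Per-step rate (1/T_n(x0))**(1/n) with x0 = (ratio + 1)/(ratio - 1)."""
    x0 = (ratio + 1.0) / (ratio - 1.0)
    t_n = math.cosh(n * math.acosh(x0))      # T_n(x0), valid for x0 > 1
    return (1.0 / t_n) ** (1.0 / n)


def worst_case_damping(n_k, n, samples=2001):
    """Largest |prod_j (1 - omega_j * lam)| over the high-frequency interval."""
    omegas = chebyshev_parameters(n_k, n)
    lo, hi = n_k ** 2 / 16.0, n_k ** 2 / 2.0
    worst = 0.0
    for s in range(samples):
        lam = lo + (hi - lo) * s / (samples - 1)
        worst = max(worst, abs(math.prod(1.0 - w * lam for w in omegas)))
    return worst


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, round(effective_smoothing_rate(8.0, n), 4),
              round(effective_smoothing_rate(4.0, n), 4))
```

Calling `effective_smoothing_rate(8.0, n)` corresponds to the spectral case and `effective_smoothing_rate(4.0, n)` to the finite difference case.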


TABLE 2. Smoothing Rates for Euler Iteration on Poisson's Equation


   number of      spectral          finite difference
   parameters     smoothing rate    smoothing rate

       1             0.7778            0.6000
       2             0.6585            0.4685
       3             0.5995            0.4198
       4             0.5676            0.3964
       5             0.5485            0.3749


It should be kept in mind that this larger eigenvalue ratio for the 
spectral method occurs because this method represents the larger eigenvalues 
of the partial differential equation much better than finite difference 
methods. Indeed, it is just this property which is responsible for the 
exponential convergence rate of spectral methods as N is increased and for 
their low phase-error in time-dependent calculations. 

Another obvious relaxation scheme is point-Jacobi. The actual
implementation of this method requires that the diagonal elements of the
matrix L be known explicitly. Consider the one-dimensional situation,
where L is given by equation (12) for the general case. It would appear
that the evaluation of the elements $L_{jj}$ requires $O(N^2)$ operations. This
would be impractical since the results of the previous section suggested that
only O(N ln N) operations are needed to get the solution itself.

Nonetheless, Jacobi relaxation is worth considering since transform
methods may be employed to compute the requisite diagonal elements in only
O(N ln N) operations. It is clear from equation (9) that $M_j$ is odd in j.
Thus,

$L_{jj} = -\sum_{l=0}^{N-1} a_l M_{j-l}^2$ .   (41)

But this is a convolution sum and may be evaluated efficiently by the
transform methods described in reference 8. Therefore, even for non-linear
problems, Jacobi relaxation may be implemented efficiently.
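The convolution argument can be illustrated in Python. The sketch rests on assumptions that are not taken from the paper: the one-dimensional operator is taken to be L = M A M with A = diag(a), and M is taken to be the usual Fourier collocation derivative matrix for even N, whose entries depend only on j - l through an odd kernel. A small radix-2 FFT is included so that the example needs only the standard library; all names are illustrative.

```python
import cmath
import math


def fft(x, sign=-1):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], sign), fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def circular_convolution(a, b):
    """c[j] = sum_l a[l] * b[(j - l) mod N], evaluated with three FFTs."""
    n = len(a)
    product = [fa * fb for fa, fb in zip(fft(a), fft(b))]
    return [v.real / n for v in fft(product, sign=+1)]


def derivative_kernel(n):
    """Odd kernel m[j] of the Fourier collocation derivative (n even)."""
    return [0.0] + [0.5 * (-1) ** j / math.tan(math.pi * j / n)
                    for j in range(1, n)]


def jacobi_diagonal(a):
    """Diagonal of L = M diag(a) M, with M[j][l] = m[(j - l) mod N], m odd."""
    m = derivative_kernel(len(a))
    return [-c for c in circular_convolution(a, [v * v for v in m])]
```

Because the kernel is odd, the product $M_{jl} M_{lj}$ equals $-M_{j-l}^2$, which is why a single convolution of the coefficient with the squared kernel suffices.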


Numerical Examples 


The spectral multi-grid method was implemented for the two-dimensional
problem [eq. (16)] for which the coefficients are

$a(x,y) = b(x,y) = 1 + \varepsilon e^{\cos(x+y)}$   (42)

and the solution itself is

$u(x,y) = \sin(\pi\cos x + \pi/4) \sin(\pi\cos y + \pi/4)$ .   (43)

The Fourier coefficients of this function may be expressed in terms of Bessel
functions. Reference 3 (pp. 35-37) uses this function to illustrate
exponential convergence. The term $\pi/4$ serves to make all the Fourier
coefficients non-zero. The constant $\varepsilon$ in equation (42) measures the
departure of the equation from the strictly Poisson form.

A simple control structure was selected for the multi-grid algorithm: 
start on the finest level and relax once on each level in turn until the 
coarsest level k = 2; there iterate until convergence; then work back up to
the finest level, relaxing once more on each intermediate level. This 
process is repeated until the desired accuracy is achieved. This algorithm 
requires more frequent interpolation but is less arbitrary than many 
alternatives. Despite the necessity for employing the Fast Fourier Transform 
in the trigonometric interpolations, this portion of the computations takes 
less than 10% of the total computation time. 
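The control structure just described amounts to a fixed schedule of levels. The Python sketch below lists that schedule for one cycle. It assumes that "intermediate" on the way up means the levels strictly between the coarsest and the finest, so that the relaxation on the finest level is the one that opens the next cycle; the function name and the action labels are illustrative.

```python
def cycle_schedule(finest, coarsest=2):
    """Levels visited in one cycle, paired with the action taken on each."""
    down = [(k, "relax once") for k in range(finest, coarsest, -1)]
    bottom = [(coarsest, "iterate to convergence")]
    up = [(k, "relax once") for k in range(coarsest + 1, finest)]
    return down + bottom + up


if __name__ == "__main__":
    for level, action in cycle_schedule(5):
        print(level, action)
```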


TABLE 3. RMS Residuals for Fourier Spectral Multi-grid
Using Stationary Euler Iteration


   relaxation
   number        ε = 0.0       ε = 0.1       ε = 0.2

       3         2.92 (1)      3.23 (0)      3.72 (0)
       6         2.27 (-1)     2.49 (-1)     3.12 (-1)
       9         3.24 (-2)     3.52 (-2)     4.40 (-2)
      12         1.02 (-2)     1.11 (-2)     1.37 (-2)
      15         4.00 (-3)     4.37 (-3)     5.55 (-3)


The results of calculations for which the finest level K = 5 are shown
in Tables 3 and 4. The non-stationary Euler iteration used 3 distinct
parameters. The transfer between grids does not occur until all 3
relaxations have been performed. The residuals are listed in the tables
after every 3 relaxations on the finest grid. The number in parentheses is
the exponent of the residual. For comparison purposes note that Euler
iteration on a single grid exhibits a residual of about 10 after 15
relaxations. The multi-grid results are a marked improvement.


TABLE 4. RMS Residuals for Fourier Spectral Multi-grid
Using Non-stationary Euler Iteration


   relaxation
   number        ε = 0.0       ε = 0.1       ε = 0.2

       3         2.82 (1)      3.12 (1)      3.47 (1)
       6         2.42 (-1)     2.10 (-1)     2.56 (-1)
       9         6.35 (-3)     3.68 (-3)     5.57 (-3)
      12         4.56 (-4)     3.19 (-4)     6.30 (-4)
      15         8.30 (-5)     5.36 (-5)     1.17 (-5)


On a 32 x 32 grid the true solution of the Fourier collocation equation
(18) has an RMS error of 5.08 (-10) compared with the exact solution of
equation (43) for $\varepsilon = 0.0$. The RMS error of the non-stationary iteration
after 15 fine-grid relaxations is 2.20 (-7). To get the full accuracy
out of a spectral method it may be necessary to reduce the residual by many
orders of magnitude. By contrast a second-order finite difference
approximation on a 32 x 32 grid gives an RMS error of 7.64 (-2) for
the $\varepsilon = 0.0$ problem. Even a fourth-order method gives only 5.04 (-3).

For this problem, at least, it seems worthwhile to accept the less
advantageous smoothing rate of the spectral multi-grid method (see Table 2),
since a far smaller grid can be used than for a finite difference method.


DIRICHLET PROBLEMS 
Chebyshev Spectral Approximations 

For problems with Dirichlet (or Neumann) boundary conditions, spectral 
approximations should be based upon Chebyshev series. The standard interval 
is [-1,1] and the collocation points are 

$x_j = \cos(\pi j/N)$ ,   j = 0, 1, ..., N .   (44)

The analog to equation (7) with Dirichlet boundary conditions may be written
in the form of equations (11)-(14) where now

$C_{jl} = (2/(N \bar{c}_j \bar{c}_l)) \cos(\pi jl/N)$ ,   j, l = 0, 1, ..., N ,   (45)

$\bar{c}_j = 2$ for j = 0 or j = N ;   $\bar{c}_j = 1$ for $1 \le j \le N-1$ ,   (46)

and

$D_{jl} = 2l/c_j$ for $l \ge j+1$ and $l \equiv j+1 \pmod 2$ ;   $D_{jl} = 0$ otherwise ,   (47)

$c_j = 2$ for j = 0 ;   $c_j = 1$ for $j \ge 1$ .   (48)



Reference 3 is a good source for many details about Chebyshev collocation.
The matrix M which represents the Chebyshev approximation to a first
derivative is again given by equation (7) where now

$M_{jl} = -(\hat{M}_{j+l} + \hat{M}_{j-l})/(\bar{c}_l \sin(\pi j/N))$   for $1 \le j \le N-1$ ,

$M_{00} = -M_{NN} = (2N^2 + 1)/6$ ,   (49)

$M_{0l} = -M_{N,N-l} = 2(-1)^l/(1 - \cos(\pi l/N))$   for $1 \le l \le N-1$ ,

where

$\hat{M}_j = 0$ for $j = 0, \pm 2N, \pm 4N, \ldots$ ;   $\hat{M}_j = (1/2)(-1)^j \cot(\pi j/2N)$ otherwise .   (50)

Once more M is a full matrix but the product M U can be evaluated in
O(N ln N) operations.
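For small N the matrix M can simply be formed and applied directly, which is a convenient check on any transform-based implementation. The Python sketch below uses the textbook closed form of the Chebyshev collocation derivative matrix on the points $x_j = \cos(\pi j/N)$; it is an $O(N^2)$ reference construction, not the O(N ln N) transform evaluation referred to above, and the function name is illustrative.

```python
import math


def chebyshev_derivative_matrix(n):
    """Collocation derivative matrix on x_j = cos(pi*j/n), j = 0..n."""
    x = [math.cos(math.pi * j / n) for j in range(n + 1)]
    c = [2.0 if j in (0, n) else 1.0 for j in range(n + 1)]
    m = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        for l in range(n + 1):
            if j != l:
                m[j][l] = (c[j] / c[l]) * (-1) ** (j + l) / (x[j] - x[l])
    for j in range(1, n):
        m[j][j] = -x[j] / (2.0 * (1.0 - x[j] ** 2))   # interior diagonal
    m[0][0] = (2.0 * n * n + 1.0) / 6.0               # corner entries
    m[n][n] = -m[0][0]
    return x, m
```

Applying the matrix to samples of a polynomial of degree at most N returns the exact derivative at the collocation points, up to rounding.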


Pre-conditioned Euler Iteration Using Multiple Grids 

The direct analog of the Euler iteration method described in the
preceding section is not practical for the Dirichlet problem. The difficulty
is that for the Chebyshev second derivative operator the multi-grid condition
number grows as $N^2$. In the one-dimensional case Gershgorin's Theorem can
be used to show that the largest eigenvalue grows as $N^4$ (ref. 3). All but
the several largest eigenvalues are good approximations to the eigenvalues of
the continuous problem. Thus, the smallest high-frequency eigenvalue grows
as $N^2$. (Direct numerical computation of the eigenvalues supports these
conclusions.) Since the ratio of these two eigenvalues (the multi-grid
condition number) is $O(N^2)$, the smoothing rate of a straightforward Chebyshev
Euler multi-grid method is of the same order as the spectral radius of the
Fourier Euler iteration on a single grid (see Table 1). The non-stationary
Chebyshev Euler multi-grid method has the same problem.


Clearly, pre-conditioning is essential for an effective Chebyshev
spectral multi-grid algorithm based on Euler iteration. Thus, in place of
equation (20) the relaxation scheme is

$U \leftarrow U - \omega H^{-1}(F - L U)$ ,   (51)


where the pre-conditioning matrix is denoted by H. An effective
pre-conditioning matrix has been devised by Orszag (ref. 9) for finding solutions
iteratively on a single grid to Chebyshev spectral approximations. That
pre-conditioning matrix, denoted here by S, is a full finite difference
approximation to the spectral matrix L. Orszag noted that the conventional
condition number of the matrix $S^{-1}L$ should be about 2.4 regardless of N.


The pre-conditioning matrix employed in the present spectral multi-grid
calculations is a cheaper but less precise version of S. Instead of using
S itself an approximate lower-triangular/upper-triangular decomposition of
S is used as H, i.e.,

$H = \mathcal{L}\mathcal{U}$ ,   (52)

where script letters are used to denote the lower-triangular ($\mathcal{L}$) and
upper-triangular ($\mathcal{U}$) factors. This matrix H is cheaper to employ than S
because the action of $H^{-1}$ can be found by simple forward- and
back-substitutions, whereas finding the action of $S^{-1}$ amounts to computing
the solution to a finite difference discretization of the problem.

To determine H one starts with S as a standard finite difference
approximation to equation (16) on the non-uniform grid of the Chebyshev
collocation points. The matrices $\mathcal{L}$ and $\mathcal{U}$ are determined by the row sum
agreement factorization which enforces the following conditions:

(1) $\mathcal{L}$ and $\mathcal{U}$ have non-zero elements only on those positions which
correspond to the non-zero elements in the lower- and upper-triangular
part of S itself.

(2) Whenever $S_{ij} \ne 0$ for $i \ne j$, then $H_{ij} = S_{ij}$. (The off-diagonal
elements of H whose locations correspond to the non-zero off-diagonal
elements of S are set to those values.)

(3) The row sums of H are the same as those of S.
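The three conditions above can be met by an incomplete factorization in which every fill-in term that would fall outside the sparsity pattern of S is added to the diagonal of the same row. The Python sketch below implements this idea (commonly called a modified incomplete LU factorization) on a small dense-stored matrix. It is offered under the assumption that this is what the row sum agreement factorization amounts to; the details of the procedure actually used may differ, and the helper `five_point_laplacian` is only a stand-in test matrix, not the non-uniform-grid matrix S of the paper.

```python
def row_sum_agreement_lu(s):
    """Incomplete LU on the sparsity pattern of s; dropped fill-in is moved
    to the diagonal so that the row sums of L*U equal those of s."""
    n = len(s)
    pattern = [[s[i][j] != 0.0 for j in range(n)] for i in range(n)]
    a = [row[:] for row in s]
    for i in range(n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            a[i][k] /= a[k][k]                  # multiplier, stored in L
            for j in range(k + 1, n):
                if not pattern[k][j]:
                    continue
                update = a[i][k] * a[k][j]
                if pattern[i][j]:
                    a[i][j] -= update           # ordinary elimination step
                else:
                    a[i][i] -= update           # compensate dropped fill-in
    lower = [[a[i][j] if j < i else (1.0 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    upper = [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper


def five_point_laplacian(side):
    """Five-point Laplacian on a side x side grid (stand-in test matrix)."""
    n = side * side
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row, col = divmod(i, side)
        s[i][i] = 4.0
        if col > 0:
            s[i][i - 1] = -1.0
        if col < side - 1:
            s[i][i + 1] = -1.0
        if row > 0:
            s[i][i - side] = -1.0
        if row < side - 1:
            s[i][i + side] = -1.0
    return s
```

Applying the inverse of the product then costs one forward and one back substitution with factors that are as sparse as S.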

For further details on this sort of pre-conditioning see reference 12.


TABLE 5. Extreme Eigenvalues of the Pre-conditioned Matrices


              S^-1 L                   H^-1 L
    N     smallest    largest      smallest    largest

    4      1.000       1.757        1.037       1.781
    8      1.000       2.131        1.061       2.877
   16      1.000       2.305        1.043       4.241
   32      1.000       2.363        1.034       5.379


The decreased accuracy of the matrix H is indicated in Table 5, which
lists the smallest and largest eigenvalues of the pre-conditioned matrix
$H^{-1}L$. In contrast to the matrix $S^{-1}L$, for which the largest eigenvalue is
roughly 2.4, the largest eigenvalue here shows a slow growth with N,
evidently increasing as $\sqrt{N}$. Both matrices yield essentially the same value
for the smallest eigenvalue. Moreover, the smallest high-frequency
eigenvalue of $H^{-1}L$ stays roughly constant, at about 1.45, as N
increases. Thus, the multi-grid condition number of this pre-conditioned
Euler method increases slowly with N.


The eigenvalue results given above suggest that an Euler iteration
scheme using the approximate $\mathcal{L}\mathcal{U}$ factorization form of pre-conditioning will
have the convergence rates listed in Table 6. The advantage of using
multiple grids here is not as great as in the periodic case. The basic
problem is the slow growth of the multi-grid condition number with N.
Clearly, better forms of pre-conditioning are needed.


TABLE 6. Convergence Rates for Euler Iteration in Two-dimensions


    N      single grid        multi-grid
           spectral radius    smoothing rate

    4         0.264              0.264
    8         0.462              0.330
   16         0.605              0.490
   32         0.725              0.630


The interpolation for this multi-grid scheme can be based upon the
Chebyshev polynomial expansions of the solution. Expressions analogous to
equations (31) to (33) can be employed, where equation (45) is now used for
the matrix C and the expression for the matrix E is altered accordingly.
If the boundary conditions are homogeneous, then C can easily be
manipulated into a self-adjoint form.

Non-stationary Euler iteration will, of course, improve the multi-grid
smoothing rates. The use of 4 distinct parameters reduces the smoothing
rates of the N = 16 and N = 32 cases to 0.30 and 0.40, respectively.

Point-Jacobi is a viable alternative here as well. The present form of
the matrix M [eq. (49)] also permits the diagonal elements of variable
coefficient (or non-linear) problems to be computed efficiently by transform
methods. Two convolution sums now appear in the analog of equation (41).
The portion involving the j - l terms is evaluated in the usual manner after
allowing for special treatment of the terms for which j = 0 and j = N.
The portion involving the j + l terms is evaluated in transform space as the
product of the transform of $M_j$ and the complex conjugate of the transform
of the variable coefficient term.


Numerical Example 

The test problem for the Chebyshev multi-grid method has the 
coefficients 


$a(x,y) = b(x,y) = 1 + \varepsilon (x^2 + y^2)$   (53)

for the exact solution 

$u(x,y) = \sin(\pi\cos x) \sin(\pi\cos y)$ .   (54)

Some of the results using the finest level K = 5 are listed in Table 7. On
a single grid the residual for the $\varepsilon = 0.0$ case is 8.39 (-1) after 15
relaxations. The exact solution to the discrete equations for this case has
an error that is essentially round-off error. There is relatively little
content in the high-frequency component. The multi-grid approach to this
problem makes its biggest gains by using the coarser grids to damp out the
low-frequency components.


TABLE 7. RMS Residuals for Chebyshev Spectral Multi-grid
Using Stationary Euler Iteration


   relaxation
   number        ε = 0.0       ε = 0.1       ε = 0.2

       3         1.25 (0)      1.29 (0)      1.32 (0)
       6         2.14 (-1)     1.89 (-1)     1.67 (-1)
       9         4.68 (-2)     3.81 (-2)     3.16 (-2)
      12         1.18 (-2)     9.14 (-3)     7.34 (-3)
      15         3.32 (-3)     2.47 (-3)     1.93 (-3)


An example similar to equation (54) was examined in reference 13, where
two schemes were given for solving the constant coefficient Chebyshev 
equations exactly. The results of a recent note (ref. 14) suggest that 
greater accuracy can be achieved, especially on problems with singularities, 
by sub-dividing the original domain and patching the individual Chebyshev 
spectral solutions together along the internal boundaries. The spectral 
multi-grid method can be applied to patched collocation approximations as 
well. Moreover, the multi-grid approach would appear to present a noticeable 
improvement over the admittedly inefficient schemes used in reference 14. 


CONCLUSION 


The spectral multi-grid methods described here exhibited a substantial
improvement over the simplest iterative schemes. It has not yet been checked
whether this specific algorithm is more efficient than the best available
iterative methods. There is, of course, still room for improvement in the
spectral multi-grid methods. This is especially true for the Chebyshev
methods, for which better pre-conditioning procedures would help
considerably.

It is technically straightforward to extend this solution technique to
two-dimensional incompressible Navier-Stokes equations, particularly in the
vorticity-streamfunction formulation, since the problem addressed in this
paper is representative of the advection-diffusion equation. Present efforts
are directed towards using the spectral multi-grid method to compute the
classical problem of flow past a circular cylinder. The appropriate method
for this geometry combines a Fourier approximation in angle and a Chebyshev
approximation in radius.


APPLICATION OF MULTIGRID METHODS FOR INTEGRAL
EQUATIONS TO TWO PROBLEMS FROM FLUID DYNAMICS. 

H. Schippers, 

Mathematisch Centrum, 1009 AB Amsterdam. 

INTRODUCTION 

Multigrid methods have been advocated by Brandt (ref. 1) for solving
sparse systems of equations that arise from discretization of partial
differential equations. Convergence and computational complexity of such
multigrid techniques have been studied since. In reference 2 we have shown
that these techniques can also be used advantageously for the non-sparse
systems that occur in the numerical solution of Fredholm integral equations
of the second kind

(1)   f = Kf + g,

where g belongs to a Banach space X and the integral operator K is compact on
X. Theoretical and numerical investigations show that multigrid methods give
the solution of (1) in $O(N^2)$ operations as $N \to \infty$, whereas other iterative
schemes take $O(N^2 \log N)$ operations (N: the dimension of the finest grid). In
practice this results in algorithms for the solution of these integral
equations that are significantly more efficient than the other schemes. In the
present paper we apply multigrid methods to the following problems from fluid
dynamics.

Calculation of potential flow around bodies - The total velocity potential
$\phi$ is assumed to be the superposition of the potential $\phi_\infty$, due to a uniform
onset flow, and a perturbation potential $\phi_d$, due to a doublet distribution
at the body surface. This approach leads to a Fredholm equation of the second
kind for the unknown doublet distribution. We introduce a multigrid method
which makes use of a sequence of grids that are generated by dividing the
body surface into an increasing number of smaller and smaller panels. On these
grids the doublet distribution is assumed to be constant over each panel. For
a two-dimensional (2-D) aerofoil we have applied the multigrid method to the
calculation of circulatory flow around Karman-Trefftz aerofoils. The use of
multigrid techniques becomes more preferable for 3-D problems because the
number of panels is much larger than for 2-D ones. The calculations have been
performed for the flow around an ellipsoid. From numerical investigations it
follows that about 3 multigrid cycles are sufficient to obtain the approximate
solution.


Calculation of oscillating disk flow - This application deals with the
rotating flow due to an oscillating disk at an angular velocity $\Omega \sin \omega t$. The
Navier-Stokes and continuity equations are reduced by means of the von Karman
similarity transformations to


(2) 

“ f = 


+ 2hf 

(N 

1 

Vi t 

2cl* zz 

z 


(3) 

0) 

^t ^ 

JL 

2to ^zz 

+ 2hg^ 

- 2fg 

(4) 






where (f,g,h) is a measure of the velocity vector in a cylindrical polar
coordinate system $(r, \varphi, z)$. For a single disk problem the boundary conditions are:

(5)   f = h = 0, g = sin t at z = 0; f = g = 0 for $z \to \infty$.

In reference 3 the author has shown that the periodic solution:

(6)   $h(z,0) = h(z,2\pi)$; $g(z,0) = g(z,2\pi)$

can be obtained by implicit finite difference schemes taking the state of rest
as an initial condition. The transient effects have been eliminated by
calculating a sufficiently large number of periods. Using the multigrid method we
do not simulate the physical process, but reformulate the problem (2)-(6) as

(7)   (f,g,h) = K(f,g,h),

where K is a non-linear integral operator. The multigrid method for integral
equations is used to solve (7). For $\Omega = 0.1\,\omega$ the computational work has been
reduced by a factor 0.1.

The present paper is based on parts of the Doctor's Thesis of the author
prepared under the guidance of Prof. P. Wesseling of Delft University of 
Technology. 


CALCULATION OF POTENTIAL FLOW AROUND BODIES. 


For potential flow around a two- or three-dimensional body there exists
a velocity potential $\phi$ satisfying Laplace's equation

(8)   $\Delta\phi = 0$

with boundary conditions

(9)   $\partial\phi/\partial n_e = 0$   along the boundary S,

where $\partial/\partial n_e$ denotes differentiation in the direction of the outward normal to
S, and

(10)   $\phi(\zeta) \to \phi_\infty(\zeta)$   for $|\zeta| \to \infty$,

with $\phi_\infty$ the velocity potential due to a uniform onset flow. If the flow is
non-circulatory, we have $\phi_\infty(\zeta) = U\cdot\zeta$ with U the velocity vector of the
undisturbed flow. Here $U\cdot\zeta$ denotes the usual inner product in $\mathbb{R}^2$ or in $\mathbb{R}^3$.
We represent the velocity potential $\phi$ as follows




$\phi = \phi_\infty + \phi_d$ ,

with $\phi_d$ the double layer potential given by

(11)   $\phi_d(\zeta) = \frac{2^{1-m}}{\pi} \int_S \mu(z) \frac{\cos(n_z, z-\zeta)}{|z-\zeta|^{m-1}} \, dS_z$ ,   $\zeta \notin S$ ,

where m = 2,3 for the two- and three-dimensional case, respectively, and $n_z$ is the
outward normal to the boundary S at the point z. The doublet distribution $\mu$
is such that $\phi$ satisfies the boundary condition

(12)   $\phi^-(\zeta) = 0$ ,

where $\phi^-$ denotes the limit from the inner side to S. Using the Plemelj-Privalov
formulae (see reference 4) we obtain the following integral equation

(13)   $\mu(\zeta) + \frac{2^{2-m}}{\pi} \int_S \mu(z) \frac{\cos(n_z, z-\zeta)}{|z-\zeta|^{m-1}} \, dS_z = -2\phi_\infty(\zeta)$ ,   $\zeta \in S$ .

Assuming the boundary S to be sufficiently smooth it can be proven that the
solution of the interior Dirichlet-problem (12) also satisfies the
Neumann-problem (8)-(10) for the exterior of the boundary S.


Calculation of Circulatory Flow around an Aerofoil. 


For circulatory flow around an aerofoil one must introduce a cut to make 
the velocity potential single valued. The Kutta condition of smooth flow at 
the trailing edge can be satisfied if we construct the cut from the trailing 
edge to infinity. 


We denote the upper and lower side of the cut by $S^+$ and $S^-$, respectively. The
contour composed of the aerofoil S and the cut is denoted by $S^- + S + S^+$. Along
the cut there exists a constant discontinuity in velocity potential. The jump
is represented by a constant double layer potential with strength $\mu^+$ and $\mu^-$
along $S^+$ and $S^-$, respectively. The difference $\mu^+ - \mu^-$ is equal to the
circulation, which is taken positive in clockwise direction.

We can represent the velocity potential by

$\phi(\zeta) = U\cdot\zeta + \frac{1}{2\pi} \int_{S^- + S + S^+} \mu(z) \frac{\cos(n_z, z-\zeta)}{|z-\zeta|} \, dS_z$

or rewritten

(14)   $\phi(\zeta) = U\cdot\zeta + \phi_d(\zeta) + \frac{1}{2\pi} (\mu^+ - \mu^-) \arg(z_t - \zeta)$ ,

where $\phi_d$ is defined by (11) with m = 2 and $z_t$ is the trailing edge. In this
section we denote by $\arg(z_1/z_2)$ with $z_1, z_2 \in \mathbb{R}^2$ the real value of the usual
function defined by the complex numbers corresponding to $z_1$ and $z_2$. The
doublet strength along S follows from (12). So far we did not say anything about
$\mu^+$ and $\mu^-$, but we still have to satisfy the Kutta condition. In the present
paper we only consider aerofoils with non-zero trailing edge angle. For these
cases the Kutta condition states that the flow speed must be zero at both
sides of the trailing edge. Let $\zeta^+$ and $\zeta^-$ be points at the upper and lower
part of the trailing edge. The Kutta condition is satisfied if:

(15)   $D\mu(\zeta^+) \to 0$   for $|\zeta^+ - z_t| \to 0$ ,
       $D\mu(\zeta^-) \to 0$   for $|\zeta^- - z_t| \to 0$ ,


where D denotes differentiation in the tangential direction. Application of
conditions (12) and (15) to (14) yields the following integral equation

(16)   $(I - K)\mu + e\,(\mu^+ - \mu^-) = g$ ,

with

(17)   $K\mu(\zeta) = -\frac{1}{\pi} \int_S \mu(z) \frac{\cos(n_z, z-\zeta)}{|z-\zeta|} \, dS_z$ ,
       $e(\zeta) = \frac{1}{\pi} \arg(z_t - \zeta)$ ,
       $g(\zeta) = -2\,U\cdot\zeta$ .


Numerical approach - The contour S is divided into N segments $S_i$ such
that $S = \bigcup_{i=1}^{N} S_i$ and $S_i \cap S_j = \emptyset$, $i \ne j$. The begin- and end-points of the i-th
segment are $z_{i-1}$ and $z_i$ and are called nodal points. On this grid $\mu$ is
approximated by a piecewise constant function $\mu_N$ and the resulting equation is
solved by a collocation method. The collocation points $\zeta_i$, i = 1,2,..., N,
are taken to be the mid-points of the segments $S_i$. By means of projection at
the collocation points we get N equations. However, we have N + 2 unknowns

$\mu_{N,1}, \ldots, \mu_{N,N}, \mu_N^+, \mu_N^-$ ,

so that we need two extra equations. Following condition (15) we replace $\mu_N^+$
and $\mu_N^-$ by $\mu_{N,N}$ and $\mu_{N,1}$, i.e.

(18)   $\mu_N^+ = \mu_{N,N}$ ,   $\mu_N^- = \mu_{N,1}$ ,

where $\zeta_1$ and $\zeta_N$ are the collocation points which are closest to the trailing
edge at the lower and upper part of S. Let $T_N$ be the projection operator
defined by piecewise constant interpolation at the collocation points. We have
to solve the following equation

(19)   $(I - T_N K)\mu_N + T_N e\,(\mu_{N,N} - \mu_{N,1}) = T_N g$ .

In aerodynamics the above numerical approach is called a first order panel
method. In reference 5 we have put it in a functional analytic framework.
Assuming the contour S to be sufficiently smooth (except for a small region
near the trailing edge) it was shown that a once continuously differentiable
numerical solution can be obtained by a single iteration

(20)   $\tilde{\mu}_N = g + K\mu_N - e\,(\mu_{N,N} - \mu_{N,1})$ .

Furthermore, it was proven that the operator K is compact on the space of
essentially bounded functions, provided the boundary is sufficiently smooth.
Since aerofoils (inclusive of the trailing edge) are not smooth this property of
K does not hold for our application.


Multigrid method - The principal aim of this section is to show that
equation (19) can be solved efficiently by a multigrid iterative process. In
reference 2 we introduced multigrid methods for integral equation (1).
Jacobi relaxation was used to smooth the high-frequency errors. Assuming the
integral operator to be compact we were able to prove that the reduction
factors of these multigrid methods decrease as N increases. For our application
this nice property is completely destroyed (see table 1) because K is not
compact. Problems with respect to the convergence of the iterative process
arise in the neighbourhood of the trailing edge. Here the high-frequency
errors are not removed by the Jacobi relaxation:

(21)   $\mu_N^{(\nu+1)} = T_N g + T_N K \mu_N^{(\nu)} - T_N e\,(\mu_{N,N}^{(\nu)} - \mu_{N,1}^{(\nu)})$ .

Inspection of the matrix corresponding to $T_N K T_N$ reveals that the
cross-diagonal contains elements of magnitude $k - 1 + O(1/N)$ as $N \to \infty$ with
k = (exterior trailing edge angle)/$\pi$. This occurrence of off-diagonal elements
of about the same size as diagonal elements explains why Jacobi relaxation does
not work well. Therefore we apply another relaxation scheme, which we call paired
Gauss-Seidel relaxation. In order to explain this scheme we first rewrite (21)
as follows:


$\mu_{N,i}^{(\nu+1)} = \sum_{l=1}^{N} k_{il}\,\mu_{N,l}^{(\nu)} + g_i - e_i\,(\mu_{N,N}^{(\nu)} - \mu_{N,1}^{(\nu)})$   for i = 1(1)N.

We obtain the paired Jacobi relaxation (PJ) scheme by removing the
cross-diagonal to the left-hand side:

$\mu_{N,i}^{(\nu+1)} - k_{ij}\,\mu_{N,j}^{(\nu+1)} = \sum_{l \ne j} k_{il}\,\mu_{N,l}^{(\nu)} + g_i - e_i\,(\mu_{N,N}^{(\nu)} - \mu_{N,1}^{(\nu)})$

for i = 1,2,..., N/2 and j = N+1-i. A similar expression is obtained with the
roles of i and j interchanged. As a result we have to solve systems of equations
of dimension 2. Substituting the new values of $\mu_{N,i}$ and $\mu_{N,j}$ as soon as they
are available we obtain the paired Gauss-Seidel (PGS) relaxation scheme. For
i = 1,2,..., N/2 and j = N+1-i we define


$\tilde{\mu}_{N,l} = \mu_{N,l}^{(\nu)}$ for $i \le l \le j$ ;   $\tilde{\mu}_{N,l} = \mu_{N,l}^{(\nu+1)}$ for $l < i$ and $l > j$ .

We solve simultaneously the following equations

$\mu_{N,i}^{(\nu+1)} - k_{ij}\,\mu_{N,j}^{(\nu+1)} = \sum_{l \ne j} k_{il}\,\tilde{\mu}_{N,l} + g_i - e_i\,(\mu_{N,N}^{(\bar{\nu})} - \mu_{N,1}^{(\bar{\nu})})$

and

$\mu_{N,j}^{(\nu+1)} - k_{ji}\,\mu_{N,i}^{(\nu+1)} = \sum_{l \ne i} k_{jl}\,\tilde{\mu}_{N,l} + g_j - e_j\,(\mu_{N,N}^{(\bar{\nu})} - \mu_{N,1}^{(\bar{\nu})})$

for i = 1, 2, ..., N/2 and j = N+1-i, with $\bar{\nu} = \nu$ for i = 1 and $\bar{\nu} = \nu + 1$ for
$1 < i \le N/2$. The matrix elements $k_{ij}$ can be easily calculated. Let


d> . . 

ij 


1 

= ^ arg 


z. 

J 






9 


then

$k_{ij} = \phi_{ij}$   for $i \ne j$ ,

$k_{ii} = \phi_{ii} + 1$ if $\phi_{ii} < 0$ ;   $k_{ii} = \phi_{ii} - 1$ if $\phi_{ii} > 0$ .

Let $X_p$ be a short notation for the space $X_{N_p}$ of piecewise constant
functions of dimension $N_p$. We introduce a sequence of spaces
$\{X_p \mid p = 0, 1, \ldots, \ell\}$ with $N_p = 32 \cdot 2^p$ such that

$X_0 \subset X_1 \subset \cdots \subset X_\ell$ .

The corresponding projection operators are denoted by $T_p$. In the context of
multigrid iteration the subscript p is called level.

The calculations have been performed for several Karman-Trefftz aerofoils
with thickness $\delta = 0.05$ and length $\ell = 1.0$. These aerofoils are obtained
from the circle in the $\lambda$-plane, $\lambda = c e^{i\theta}$, by means of the mapping

$z = f(\lambda) = (\lambda - \lambda_t)^k / (\lambda - c(\delta - i\gamma))^{k-1}$ ,

where $\gamma$ measures the camber and k the exterior trailing edge angle;

$c = 2\ell\,(\delta + (1 - \gamma^2)^{1/2})^{k-1} / (2(1 - \gamma^2)^{1/2})^k$ ,

$\lambda_t = c\,((1 - \gamma^2)^{1/2} - i\gamma)$ .

Partition of the boundary on level p: Let the interval $[0, 2\pi]$ be divided
into $N_p$ uniform segments with nodal points $\{\theta_j \mid j = 0(1)N_p\}$. The nodal and
collocation points in the z-plane follow from $f(c e^{i\theta_j})$ and $f(c e^{i\theta_{j+1/2}})$,
respectively, $\theta_{j+1/2}$ being the midpoint of subinterval $[\theta_j, \theta_{j+1}]$. The
collocation points defined in this way are situated at the boundary and do not
coincide with the collocation points of the other levels. Therefore, the
elements of the matrix $K_p$, $p = 0, 1, \ldots, \ell$, corresponding to $T_p K$ have to be
computed for all levels. Asymptotically for $\ell \to \infty$, the number of kernel
evaluations is $\frac{4}{3} N_\ell^2$ when the values are computed once and stored. We have
taken the following testcases:

I. k = 1.90 and $\gamma$ = 0,

II. k = 1.90 and $\gamma$ = sin 0.05,

III. k = 1.99 and $\gamma$ = 0,

IV. k = 1.99 and $\gamma$ = sin 0.05.

The velocity U of the undisturbed flow is taken to be $(\cos\tau, \sin\tau)$ with $\tau$
the angle of incidence. For the above testcases we give numerical results for
$\tau = 0$ and $\tau = \pi/2$.

Algorithm: The approximate solution of (19) is obtained by the multigrid
method defined in the ALGOL-68 program given in TEXT 1:

    PROC mulgrid = (INT p, REF VEC u, VEC g) VOID:
    IF p = 0
    THEN solve directly (u, g)
    ELSE FOR i TO ν
         DO relax (u, g); INT n = UPB u;
            VEC residu = g - u + K[p]*u - e[p]*(u[n] - u[1]);
            VEC gm = restrict (residu);
            VEC um := 0*gm;
            mulgrid (p-1, um, gm);
            u := u + interpolate (um);
            relax (u, g)
         OD
    FI

TEXT 1. Multigrid algorithm.

Because of reasons of efficiency the number of coarse grid corrections
(integer ν) must be less than 4. For ν = 1 and ν = 2 we obtain the so-called
V- and W-cycle, respectively. Here we choose ν = 2. For the 3-D problem of
flow around an ellipsoid we take ν = 1. The interaction between the grids is
defined by the procedures restrict and interpolate which are specified as
follows. Let n be the upper bound of VEC u, then:

    restrict (u)[i] := 0.5 * (u[2*i] + u[2*i-1]),   i = 1(1)n/2,

    interpolate (u)[2*i] := interpolate (u)[2*i-1] := u[i],   i = 1(1)n.
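In Python, with 0-based indexing in place of the 1-based ALGOL-68 indexing, the two grid transfer procedures can be sketched as follows. The function names mirror the procedure names above; the code itself is an illustration, not the original implementation.

```python
def restrict(u):
    """Average neighbouring pairs: fine grid (length n) to coarse grid (n/2)."""
    return [0.5 * (u[2 * i] + u[2 * i + 1]) for i in range(len(u) // 2)]


def interpolate(u):
    """Piecewise constant prolongation: coarse grid (n) to fine grid (2n)."""
    return [value for value in u for _ in range(2)]
```

Restricting an interpolated vector returns the original vector, so the coarse grid values are preserved by a round trip.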

On level 0 the system of equations is solved by Gaussian elimination. For
relax we take Jacobi, paired Jacobi and paired Gauss-Seidel relaxation,
respectively. We start our algorithm on level 0. The interpolation to level p
(p ≥ 1) of the approximate solution from level p-1 is used as initial guess of
the multigrid process at level p; truncation occurs when the residual is less
than $10^{-6}$. Let VEC $g_p$ denote the restriction of g to the collocation points of
level p. In ALGOL-68 notation this algorithm reads:

    solve directly (u0, g0);
    FOR p TO 3
    DO up := interpolate (u0);
       FOR i TO 25 WHILE residual > 10^-6
       DO mulgrid (p, up, gp) OD;
       u0 := COPY up
    OD;

TEXT 2. Implementation of the full multi-grid algorithm.
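The structure of TEXT 1 and TEXT 2 can be tried out in Python on a small model problem. The sketch below is not the aerofoil computation: it assumes a smooth kernel 0.5 cos(x - y) on [0,1] discretized by the midpoint rule with 4·2^p panels, an arbitrary right-hand side, plain Jacobi relaxation, and no circulation term e. All names and the model problem are illustrative choices.

```python
import math


def kernel_matrix(n):
    """Midpoint-rule matrix of K f(x) = int_0^1 0.5*cos(x - y) f(y) dy."""
    h = 1.0 / n
    x = [(i + 0.5) * h for i in range(n)]
    return [[0.5 * h * math.cos(xi - xl) for xl in x] for xi in x]


def rhs(n):
    return [1.0 + (i + 0.5) / n for i in range(n)]


def apply(k, u):
    return [sum(kil * ul for kil, ul in zip(row, u)) for row in k]


def residual(k, u, g):
    return [gi - ui + ki for gi, ui, ki in zip(g, u, apply(k, u))]


def restrict(u):
    return [0.5 * (u[2 * i] + u[2 * i + 1]) for i in range(len(u) // 2)]


def interpolate(u):
    return [v for v in u for _ in range(2)]


def relax(k, u, g):
    """Jacobi relaxation u <- K u + g."""
    return [gi + ki for gi, ki in zip(g, apply(k, u))]


def solve_directly(k, g):
    """Gaussian elimination with partial pivoting for (I - K) u = g."""
    n = len(g)
    a = [[(1.0 if i == j else 0.0) - k[i][j] for j in range(n)] + [g[i]]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    u = [0.0] * n
    for i in reversed(range(n)):
        u[i] = (a[i][n] - sum(a[i][j] * u[j] for j in range(i + 1, n))) / a[i][i]
    return u


def mulgrid(p, u, g, ks, nu=2):
    """Recursive cycle in the manner of TEXT 1; returns the improved u."""
    if p == 0:
        return solve_directly(ks[0], g)
    for _ in range(nu):
        u = relax(ks[p], u, g)
        gm = restrict(residual(ks[p], u, g))
        um = mulgrid(p - 1, [0.0] * len(gm), gm, ks, nu)   # zero initial guess
        u = [ui + ci for ui, ci in zip(u, interpolate(um))]
        u = relax(ks[p], u, g)
    return u


def full_multigrid(levels=3, tol=1e-6, max_cycles=25, nu=2):
    """Nested iteration in the manner of TEXT 2."""
    sizes = [4 * 2 ** p for p in range(levels + 1)]
    ks = [kernel_matrix(n) for n in sizes]
    g = rhs(sizes[0])
    u = solve_directly(ks[0], g)
    for p in range(1, levels + 1):
        g = rhs(sizes[p])
        u = interpolate(u)
        for _ in range(max_cycles):
            if max(abs(r) for r in residual(ks[p], u, g)) <= tol:
                break
            u = mulgrid(p, u, g, ks, nu)
    return u, ks[levels], g
```

Calling `full_multigrid()` returns the finest-level solution together with the finest-level matrix and right-hand side, so the residual can be inspected or the result compared with a direct solve.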

In the following table we compare the performance of the multigrid processes
using various relaxation schemes.

From this table we conclude that the multigrid method defined by Jacobi
relaxation is not acceptable (it converges too slowly). The process defined by
PGS relaxation turns out to be the most efficient. Furthermore, we draw the
following conclusions: 1. the number of iterations decreases as N increases and
2. on the highest level (N=256) only a few iterations are necessary.

TABLE 1 - NUMBER OF ITERATIONS

J = Jacobi, PJ = Paired Jacobi, PGS = Paired Gauss-Seidel.


Calculation of Potential Flow around an Ellipsoid. 


The numerical approach to find the solution of (13) is connected with the
shape of the kernel-function. Application of the collocation method in the
space of piecewise constant functions leads to moment-integrals, which consist
of the calculation of solid angles. We consider the ellipsoid defined by


2 

+ y + 


2 

z 


1 , 


The velocity of the undisturbed flow is given by U = (1,0,0). The partition
of the ellipsoid into panels is carried out as follows. First we divide the
surface into N rings by planes orthogonal to the z-axis. Next each ring is
divided into N* trapeziform segments. The spherical caps are divided into N*
triangle-form segments. We denote these segments by S_ij, i = 1(1)N and
j = 1(1)N*. The collocation points are chosen to be the "midpoints" of these
segments and are situated at the surface. The solid angle subtended at ζ by
S_ij with ζ ∉ S_ij is given by

\[ \iint_{S_{ij}} \frac{\cos(n_z, z-\zeta)}{|z-\zeta|^{2}} \, dS_z . \]

In contrast with 2-D, in general these integrals cannot be evaluated
analytically. We approximate S_ij by one or two flat planes. The solid angles
subtended by such planes can be evaluated analytically.


Multigrid method - The different grids are related by N_p = 4 * 2^p and
N*_p = 4 * 2^p. Putting 3 = 0 we use the algorithm given in TEXT 1 with v = 1.
Analogously to 2-D we define the procedures solve directly, restrict and
interpolate by Gaussian elimination, weighted injection and piecewise constant
interpolation, respectively. For relax we take the Jacobi relaxation scheme.
Assuming the surface to be smooth, Wolff (ref. 6) has analysed this multigrid
method. He has proven that the reduction factor of the multigrid process is
less than c h^α for h → 0, where h and α are a measure for the mesh-size and
the smoothness of the surface, respectively. For the ellipsoid α = 1.


Numerical results - In table 2 we give the residuals and the observed
reduction factors

\[ \eta_i = \frac{\| u_N^{(i+1)} - u_N^{(i)} \|}{\| u_N^{(i)} - u_N^{(i-1)} \|} \]

with ‖.‖ the supremum norm. We also give the mean reduction factor

\[ \bar{\eta}_k = \Bigl\{ \prod_{i=1}^{k} \eta_i \Bigr\}^{1/k} \]

and the operation count expressed in work-units. One work-unit is defined by
(total number of multiplications)/(N_l * N*_l)^2 with l the highest level. We only
take into account matrix-vector multiplications and the direct solution on
the coarsest grid, for which we count (1/3)(N_0 * N*_0)^3 multiplications. Table 2
enables us to draw the following conclusions: 1. Comparing the results obtained
with l = 2 and l = 3 we see that the mean reduction factor of the multigrid
method has been decreased by a factor 2, which is in agreement with the
theoretical results of Wolff (ref. 6) and 2. the multigrid method is much cheaper
than the Jacobi-iterative process.

TABLE 2 - POTENTIAL FLOW AROUND AN ELLIPSOID *

MULTIGRID METHOD (N_0 = 4, N*_0 = 4)

      l = 2 : N_2 = 16, N*_2 = 16            l = 3 : N_3 = 32, N*_3 = 32
iter   residual      red. factor       iter   residual      red. factor
  1    1.17 10^-1                        1    4.56 10^-2
  2    2.04 10^-3    4.13 10^-2          2    4.38 10^-4    1.67 10^-2
  3    7.75 10^-5    1.40 10^-2          3    8.48 10^-6    6.98 10^-3
  4    1.89 10^-6    4.63 10^-2          4    9.93 10^-8    2.56 10^-2
  5    6.54 10^-8    2.36 10^-2
mean red. factor: 2.83 10^-2           mean red. factor: 1.44 10^-2
operation count : 10.68                operation count : 8.53

JACOBI ITERATIVE PROCESS

      N = 16, N* = 16                        N = 32, N* = 32
iter   residual      red. factor       iter   residual      red. factor
  1    1.73                              1    2.15
  2    8.05 10^-1    4.51 10^-1          2    1.20          5.44 10^-1
  3    3.82 10^-1    4.68 10^-1          3    6.72 10^-1    5.57 10^-1
  4    1.83 10^-1    4.75 10^-1          4    3.79 10^-1    5.62 10^-1
  5    8.75 10^-2    4.78 10^-1          5    2.14 10^-1    5.64 10^-1
  6    4.20 10^-2    4.79 10^-1          6    1.21 10^-1    5.65 10^-1
  7    2.01 10^-2    4.80 10^-1          7    6.85 10^-2    5.65 10^-1
  .    .             .                   8    3.88 10^-2    5.66 10^-1
  .    .             .                   .    .             .
 21    6.94 10^-7    4.80 10^-1         27    7.79 10^-7    5.66 10^-1
mean red. factor: 4.77 10^-1           mean red. factor: 5.64 10^-1
operation count : 21                   operation count : 27

* Ellipsoid ... + y^2 + z^2 = 1, U parallel to the x-axis.
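The observed and mean reduction factors can be computed from a stored sequence of iterates as in the following Python sketch. It assumes that the observed factor is the ratio of successive differences in the supremum norm and that the mean factor is their geometric mean; the function names and the synthetic iterates (whose error is halved in every step) are hypothetical and are not data from table 2.

```python
def reduction_factors(iterates):
    """Ratios ||u(i+1) - u(i)|| / ||u(i) - u(i-1)|| in the supremum norm."""
    def sup_diff(a, b):
        return max(abs(x - y) for x, y in zip(a, b))

    diffs = [sup_diff(iterates[i + 1], iterates[i])
             for i in range(len(iterates) - 1)]
    return [diffs[i] / diffs[i - 1] for i in range(1, len(diffs))]


def mean_reduction_factor(factors):
    """Geometric mean of the observed reduction factors."""
    product = 1.0
    for eta in factors:
        product *= eta
    return product ** (1.0 / len(factors))


# Synthetic iterates converging to (1, 2); the error is halved in every step.
exact = (1.0, 2.0)
iterates = [[x + 0.5 ** i for x in exact] for i in range(6)]
factors = reduction_factors(iterates)
mean_factor = mean_reduction_factor(factors)
```

For this synthetic sequence every observed factor, and hence the mean, equals 0.5 up to rounding.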


CALCULATION OF OSCILLATING DISK FLOW. 


The rotating flow due to an infinite disk performing torsional oscillations
at an angular velocity Ω sin ωτ in a viscous fluid otherwise at rest
involves two relevant length scales: 1. the Von Karman layer thickness
(ν/Ω)^{1/2}, where ν is the kinematic viscosity and 2. the Stokes layer thickness
(ν/ω)^{1/2}. By means of the Von Karman similarity transformations the velocities
(u,v,w) in a cylindrical coordinate system (r,φ,x) can be written as:


1 /2 

u = ^2rf(z,t) , V = ^rg(z,t) ,w = - 2(2vo)) h(z,t). 


^ \ l2 

where z = x and t = cox . In that case the Navier-Stokes equations re- 

duce to the partial differential equations (2) -(4). Apparently the oscillat- 
ing disk flow is characterized by the parameter e = which determines the 

ratio of the Stokes layer thickness to the Von Karman layer thickness. 

For the high-frequency flow (ε ≪ 1) analytical solutions are found in
the literature in the form of series expansions in terms of ε. This type of
flow consists of an oscillatory inner layer (i.e. Stokes layer) near the
rotating disk and a secondary outer layer (i.e. Von Karman layer). Using a
multiple scaling technique Benney (ref. 7) was able to find series expansions
valid throughout the region of flow. The first order terms of the solution
are given by:

\[ g(z,t) = e^{-z/\varepsilon} \sin(t - z/\varepsilon) \tag{22} \]


f(z,t)~ee for z^«> 
with a = 0.265. In reference 3 we used this technique to determine the axial

3 

inflow at infinity up to the term with e : 

(23) h(«,0) = a e + {ab + O } y 

with b = -0.207. Inspection of (22) reveals that problem (2)-(6) is singularly
perturbed and for a fixed t the solution contains more and more high
frequency components as ε → 0.

In this paper we discuss two computational methods to find the periodic
solution satisfying (6). The first method is based on simulation of the
physical process by taking the state of rest as an initial condition and
eliminating the transient effects by integration in time. In mathematical
terminology this process can be interpreted as Picard's method for computing a
fixed point. Let the velocity vector be:

\[ v = (f,g,h). \]

Denote by v(z,t; v_0) the solution of the usual initial-value problem (2)-(5)
with initial data:

\[ v(z,0) = v_0(z). \tag{24} \]

Assume that the initial data belong to a suitable class L. Define a map K_ε of
L into itself by the equation

\[ K_\varepsilon(v_0) = v(z, 2\pi; v_0), \tag{25} \]

being the solution of (2)-(5) and (24) at t = 2π. Since (2)-(4) is a parabolic
system, K_ε may be expected to have a smoothing influence, just as the integral
operators of the Fredholm equations studied in reference 2. In operator
notation simulation of the physical process is written as the Picard sequence

\[ v_{i+1} = K_\varepsilon(v_i) \quad \text{with } v_0 = 0. \tag{26} \]

The periodic condition (6) rewrites as

\[ v = K_\varepsilon(v), \quad v \in L. \tag{27} \]

We remark that K_ε is a non-linear operator. For ε < 1 (26) converges slowly.
Therefore we have devised another method. Since equation (27) has a superficial
resemblance with a Fredholm equation of the second kind we have applied
a multigrid method to (27).
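The Picard sequence can be illustrated by the following Python sketch. The real period map requires integrating the parabolic system over one period; here it is replaced by a hypothetical linear contraction with factor 1 - eps, chosen only so that the iteration slows down as eps decreases. The names picard and make_period_map, and the tolerances, are assumptions of the sketch.

```python
def picard(K, v0, tol=1e-10, max_iter=200):
    """Iterate v <- K(v) until the sup-norm change ||v - K(v)|| drops below tol."""
    v = list(v0)
    for i in range(1, max_iter + 1):
        v_new = K(v)
        residual = max(abs(a - b) for a, b in zip(v, v_new))
        v = v_new
        if residual < tol:
            return v, i
    return v, max_iter


def make_period_map(eps):
    """Hypothetical stand-in for the period map: contraction factor 1 - eps."""
    def K(v):
        return [(1.0 - eps) * x + eps * 1.0 for x in v]   # fixed point: 1.0
    return K


fast_v, fast_n = picard(make_period_map(0.5), [0.0, 0.0])
slow_v, slow_n = picard(make_period_map(0.05), [0.0, 0.0], max_iter=1000)
```

Both runs start from the state of rest; the run with the smaller eps needs many more "periods", which is the behaviour that motivates the multigrid alternative.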

Numerical Approach 

This section is divided into two parts: 1. the numerical solution of the
initial-boundary value problem (2)-(5) with the initial data (24) and 2.
numerical methods for finding periodic solutions satisfying (6).

Discretization of the initial-boundary value problem - Consider the partial
differential equations (2)-(4) with the boundary conditions (5) and the
initial data (24). To this problem we apply implicit finite difference
techniques in combination with an appropriate stretching function for the
construction of the computational grid. In calculations the boundary conditions
at infinity are applied at a finite value z = L:

\[ f(L,t) = g(L,t) = 0. \tag{28} \]

We want to resolve the flow structure near the disk with a limited number of 
mesh points. Therefore, taking into account (22) we transform the z-coordinate 
by: 

(29) z(x) = £(ex + ( l-e)x^) , xe[0,l], 

and we take the mesh covering of the new range 0 ≤ x ≤ 1 uniform with stepsize
Δx = 1/N. Integration in time is done by the Euler-backward formula:

\[ g_t \approx \frac{g^{k+1} - g^{k}}{\Delta t}, \quad \text{with } \Delta t = 2\pi/T . \]

The right-hand sides of (2)-(3) are discretized by central differences at
t = t^{k+1}. The left- and right-hand side of (4) are integrated by means of
the mid-point and trapezoidal rule, respectively. The resulting non-linear
system of finite difference equations is solved by means of Newton iteration,
which is terminated if the residual is less than 10^-6. For further details
see reference 3.
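The combination of the Euler-backward formula with Newton iteration can be illustrated on a scalar model problem. The Python sketch below is not the discretization of (2)-(4); it applies the same two ingredients to the hypothetical equation g' = -g^3, and the names backward_euler_step, F and dF are assumptions of the sketch. Only the Newton tolerance of 10^-6 is taken from the text.

```python
def backward_euler_step(F, dF, g_old, dt, tol=1e-6, max_newton=20):
    """Solve y - g_old - dt*F(y) = 0 for the new time level by Newton iteration."""
    y = g_old                                  # initial guess: previous time level
    for _ in range(max_newton):
        residual = y - g_old - dt * F(y)
        if abs(residual) < tol:
            break
        y -= residual / (1.0 - dt * dF(y))     # Newton update
    return y


def F(g):
    return -g ** 3                             # model right-hand side (assumption)


def dF(g):
    return -3.0 * g ** 2


dt = 0.1
g = 1.0
history = [g]
for _ in range(10):
    g = backward_euler_step(F, dF, g, dt)
    history.append(g)
```

Each accepted step satisfies the implicit equation to within the Newton tolerance, and the computed solution decays monotonically, as the exact solution of the model problem does.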


Numerical methods for computing periodic solutions - Using the above finite
difference approach we define the discrete counterpart of the operator K_ε and
the velocity vector v by K_{ε;N,T,L} and v_N, respectively. In discrete operator
notation the periodic condition reads:

\[ v_N = K_{\varepsilon;N,T,L}(v_N). \tag{30} \]

In the present paper we propose two computational methods to solve (30): A.
simulation of the physical process by Picard iteration and B. a multigrid
method. In the first method the parameters ε, N, T and L are fixed. In the
second method the parameters N and T are taken from a sequence {(N_p,T_p)},
p = 0,1,...,l such that with p = l we have N_l = N, T_l = T and with p < q ≤ l
we have N_p < N_q, T_p < T_q (i.e. a smaller p corresponds with a coarser
discretization).

A. Simulation of the physical process: We take the state of rest
(v_N^{(0)} = 0) as an initial condition. The transient effects are eliminated by
Picard's method:

\[ v_N^{(i+1)} = K_{\varepsilon;N,T,L}\bigl(v_N^{(i)}\bigr). \tag{31} \]

The iteration index i counts the number of periods that is calculated. This
process is truncated if the residual II ~ T £ less than 
0.5 Here 


\[ \| v \| = \max_{0 \le j \le N} |g_j| + \max_{0 \le j \le N} |h_j| . \]

B. Multigrid method: We introduce a sequence of grids with N_p = 20 * 2^p
and T_p = 8 * 2^p. The integer p is called level. We replace the subscript N_p
by p:

v_{N_p} = v_p and K_{ε;N_p,T_p,L} = K_{ε;p}.

Denote the velocity at grid point x_j on level p by v_p[j] = (f_j,g_j,h_j). The
addition v_p[j] + v_p[k] and the multiplication c * v_p[j] are defined as usual
(element by element). The interaction between the grids is defined by
piecewise-linear interpolation:

interpolate (v)[j] = v[j/2],                           j = 0,2,...,2N,
interpolate (v)[j] = 0.5 * (v[(j-1)/2] + v[(j+1)/2]),  j = 1,3,...,2N-1,

and by injection:

restrict (v)[j] = v[2j], j = 0,1,...,N/2,

where N is the upper-bound of the velocity vector v.
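A minimal Python sketch of this pair of transfer operators is given below. It assumes that even fine-grid points coincide with coarse-grid points and that odd points receive the average of their two coarse neighbours; for brevity each grid value is a single number, whereas in the text it is the triple (f, g, h).

```python
def interpolate(v):
    """Piecewise-linear interpolation from N+1 coarse points to 2N+1 fine points."""
    n = len(v) - 1
    fine = [0.0] * (2 * n + 1)
    for j in range(2 * n + 1):
        if j % 2 == 0:
            fine[j] = v[j // 2]                                   # coincident point
        else:
            fine[j] = 0.5 * (v[(j - 1) // 2] + v[(j + 1) // 2])   # midpoint average
    return fine


def restrict(v):
    """Injection: keep every second fine-grid value."""
    return v[::2]


coarse = [0.0, 2.0, 4.0]
fine = interpolate(coarse)
back = restrict(fine)
```

Injection after linear interpolation returns the coarse-grid values unchanged, since the coincident points are copied without modification.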

We use a multigrid method that starts on level 0 with simulation of the
physical process (method A). For small values of ε we apply continuation.
Suppose we have the following ε-sequence ε_0 > ε_1 > ... > ε_m with ε_0 = 1. At
each stage of this continuation process we approximately solve the equation
v_0 = K_{ε_j;0}(v_0) by (31) until the residual is less than 0.5 10^-3. As initial
guess of (31) we take the solution of the previous stage; for ε_0
we take the state of rest. Denote the solution of this continuation method by
v_0(ε_0,ε_1,...,ε_m).

Since (30) is a non-linear equation it is only solved approximately. Let
v_p be an approximation to the solution of (30) on level p. We define the
defect of v_p by

d_p = v_p - K_{ε;p}(v_p).

The multigrid method is given by the ALGOL-68 program in TEXT 3, where VELO
is a mode for the vector of unknowns:

MODE VELO = STRUCT (VEC f,g,h).


PROC compute periodic solution = (INT l) VOID:
(  v_0 := v_0(ε_0,ε_1,...,ε_m);
   FOR j TO l
   DO v_j := interpolate (v_{j-1});
      multigrid (j, 1, v_j, 0)
   OD
);

PROC multigrid = (INT m, σ, REF VELO v, VELO y) VOID:
(IF m = 0
 THEN FOR k TO 50 WHILE residual > δ
      DO VELO d = y - v + K_{ε;m}(v);
         residual := ‖d‖;
         v := v + ω_k * d
      OD
 ELSE FOR i TO σ
      DO v := y + K_{ε;m}(v);
         VELO d = d_{m-1} + restrict (y - v + K_{ε;m}(v));
         VELO w := COPY v_{m-1};
         multigrid (m-1, 2, w, d);
         v := v + interpolate (w - v_{m-1})
      OD
 FI
);

TEXT 3. Multigrid algorithm for the computation of periodic solutions of
parabolic equations.

The structure of this multigrid algorithm has been proposed by Hackbusch 
(ref. 8) for the numerical solution of general time-periodic parabolic prob- 
lems. Here we apply it to the particular problem of oscillating disk flow. 

On level 0 of multigrid we use overrelaxation for extremely small values
of ε. The parameter ω_k takes the values 1, 2 and 4. Initially we put
ω_k = 1. If the axial inflow converges slowly it is multiplied by a factor 2.
As soon as the residual increases the value ω_k = 1 is restored.
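The overrelaxation strategy on level 0 can be sketched in Python as follows. The text does not state how "slow convergence" is detected, so the sketch assumes a hypothetical criterion (the residual shrinks by less than the factor slow_ratio per step); the scalar model map K and all parameter values are likewise assumptions made to obtain a runnable example.

```python
def relaxed_picard(K, v0, y=0.0, delta=0.5e-4, max_iter=50, slow_ratio=0.8):
    """Iterate v <- v + omega*d with d = y - v + K(v) and omega in {1, 2, 4}."""
    v = v0
    omega = 1.0
    previous = None
    history = []                         # relaxation parameter used in each step
    for _ in range(max_iter):
        d = y - v + K(v)
        residual = abs(d)
        if residual <= delta:
            break
        if previous is not None:
            if residual > previous:
                omega = 1.0              # residual increased: restore omega = 1
            elif residual > slow_ratio * previous and omega < 4.0:
                omega *= 2.0             # slow convergence: double omega
        history.append(omega)
        v = v + omega * d
        previous = residual
    return v, history


def K(v):
    return 0.95 * v + 0.05               # slowly contracting model map, fixed point 1


v_relaxed, omegas = relaxed_picard(K, 0.0)
```

For this model map plain Picard iteration (omega = 1) reduces the error by only 5 percent per step, whereas the strategy above settles at omega = 4 and reaches the tolerance within the 50 permitted steps.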

Numerical results - From Zandbergen and Dijkstra (ref. 9) it is known
that Von Karman's rotating disk solution can be represented sufficiently
accurately with L = 12, hence we fix infinity at this value. We give numerical
results for the following values of ε:

ε_0 = 1, ε_1 = 0.5, ε_2 = 0.1, ε_3 = 0.05.

This sequence is also applied in the continuation process that is used to
find the level-0 approximation of the multigrid method, e.g. for ε = 0.1 we have
v_0(1, 0.5, 0.1). For N = 160 and T = 64 we compare the performance
of simulation of the physical process (method A) and the multigrid method (B).
On the coarsest grid the latter method needs 20 stepsizes in space and 8
stepsizes in time; hence it uses four levels: 0, 1, 2 and 3.

Let a work unit be defined by the computational work needed for
calculating one Picard iterate with N = 160 and T = 64. In table 3 we compare
the computed axial inflow at infinity with the value of its asymptotic
approximation (23) for ε → 0. Between parentheses we give the number of work
units and the iteration error ‖v_N - K_{ε;N,T,L}(v_N)‖ where v_N is the final
solution.

On level 0 of the multigrid method we used Picard iteration (i.e. ω_k = 1)
for ε ≥ 0.1. The iterative process was terminated when the residual was less
than δ = 0.5 10^-4. For ε = 0.05 we have applied overrelaxation (1 ≤ ω_k ≤ 4)
and we have put 6^ = 10 . That is the reason why the computational work 

increased for this case. 

TABLE 3 - AXIAL INFLOW *

ε       method A                  method B                    (23)
1.0     0.2014 (8,  4.4 10^-?)    0.2014 (6.8,  9.3 10^-?)    0.2360
0.5     0.1177 (17, 4.7 10^-?)    0.1178 (7.0,  3.9 10^-?)    0.1253
0.1     0.0236 (74, 4.9 10^-?)    0.0271 (7.4,  1.6 10^-?)    0.0262
0.05    0.0083 (72, 4.9 10^-?)    0.0137 (12.5, 3.3 10^-?)    0.0132

* Between parentheses: number of work units, residual.


From table 3 we conclude that the multigrid method becomes more efficient
as ε decreases. For ε = 0.1 the computational work has been reduced by
a factor of 10. For ε = 0.1 and ε = 0.05 the numerical results of method A
still contain a low-frequency error. In this case the test for termination
of the physical process is not adequate. The process converges slowly, as
can be seen from figure 1, in which we have displayed the axial inflow as
a function of the number of periods. For ε = 0.05 the axial inflow is still
increasing after 72 periods. The same phenomenon occurs on the coarsest grid
of the multigrid method. Therefore we have applied overrelaxation.



FIGURE 1. Dependence of the axial inflow on the number of periods k.

The results of our analysis are given in figures 2-3. The profiles of
the variables f/ε, g and h/ε are displayed in figure 2. We see that there
is an oscillatory boundary layer. For smaller values of ε (see figures 2
(c-d)) the azimuthal component of velocity (g) is confined to this boundary
layer and the radial and axial component of velocity (resp. f and h) persist
outside this layer. The results for the quantities ε g_z(0,t), f_z(0,t) and
h(∞,t)/ε are displayed in figure 3. Comparing these figures we see that the
fluctuations in h(∞,t) decrease as ε → 0. This means that outside the
boundary layer the fluid motion becomes stationary (i.e. the outer flow does
not depend on t). These numerical results are in agreement with the
analytical solutions of Benney (ref. 7).

Finally, from the results just presented we conclude that for the
computation of periodic solutions of the single disk problem for ε < 1 the
multigrid method is preferable, whereas for ε > 1 simulation of the physical
process may be employed.
GENERAL RELAXATION SCHEMES IN MULTIGRID ALGORITHMS
FOR HIGHER ORDER SINGULARITY METHODS

B. Oskam and J.M.J. Fray- 
National Aerospace Laboratory, NLR 

SUMMARY 

This paper describes relaxation schemes based on an approximate and
incomplete factorization technique (AF). These AF schemes allow one to
construct a fast multigrid method for solving integral equations of the second
as well as integral equations of the first kind. Novel items are the
smoothing factors found for integral equations of the first kind, and the
comparison with similar results for equations of the second kind. Application
of the MG algorithm shows convergence to the level of the truncation error, of
a second order accurate panel method, within 2 multigrid cycles.

INTRODUCTION 

Most effort going into the application of multigrid techniques seems to be
directed to solving the sparse systems of difference equations associated
with partial differential equations. However, the multigrid technique can also
be used advantageously to solve the nonsparse systems of equations that arise
from integral equations, as shown in references 1 and 2.

In the present paper we study the application of multigrid techniques to
the solution of integral equations associated with potential flow problems.
This effort fits into the larger framework of the development, at NLR, of a
next generation singularity or "panel" method. A question associated with this
development is whether singularity methods do have a future, particularly in
view of the current progress in finite difference methods. Reference 3 contains
several arguments for a positive answer to this question, but at the same time
presents the rather stringent requirement of high computational efficiency.

The scope of the present investigation is limited to the analysis of multigrid
(MG) techniques and the subsequent application to some model problems in two
dimensions. Various relaxation schemes, which are used as smoothing operators
in multigridding, are evaluated. For some particular geometries, such as an
unbounded flat plate and a circular cylinder, this smoothing problem is
analyzed by the local mode analysis of reference 4. For more complicated
geometries, such as an airfoil, it is found that the finite-dimensional,
discrete Fourier transform can be used to define a global smoothing factor
which represents an upper bound of the actual convergence factor of the high
frequency components of the residual vector. A general multigrid algorithm is
described and applied to solve the potential flow problem of multicomponent
airfoils.

Before starting the discussion of the integral equations it is important
to realize that the asymptotic operation counts remain of the order of n^2 if
nothing is done to reduce the work associated with the residue evaluations
which involve a full matrix times vector multiplication. Multigrid methods to
lower the computational work involved with these residue evaluations are
currently being studied at NLR, see reference 3. The basic concept is to lower
the asymptotic operation counts by treating the far field connections on a
sequence of coarser grids without compromising the truncation error. These
aspects of a next generation panel method are however outside the scope of the
present paper.


INTEGRAL EQUATIONS 


Most panel methods use the boundary condition of zero normal velocity on
the surface of the contour to derive an integral equation for a distribution
of surface singularity, source or doublet, over the body surface. Let us denote
the source and doublet strength by σ and μ, respectively, and let x_p and x_q
be the positions of the points p and q. The normal velocity at the point p,
induced by distributions of these singularities may be represented as

\[ v_n^{s}(x_p) = \frac{1}{2\pi} \oint \sigma(x_q) \, \frac{\partial}{\partial n_p} (\ln r_{pq}) \, ds \tag{1} \]

and

\[ v_n^{d}(x_p) = \frac{1}{2\pi} \oint \mu(x_q) \, \frac{\partial}{\partial n_p} \frac{\partial}{\partial n_q} (\ln r_{pq}) \, ds \tag{2} \]

where r_pq = |x_p - x_q| and n_q is the outward normal, and direction of the
doublet axis, at the point q. The normal at x_p is denoted by n_p.
The integration variable s is the distance measured along the contour.


Attention is directed to two particular panel methods which may be formulated
by employing eqs. (1) and/or (2). The first is the surface source method
having an unknown source distribution on the body surface and an auxiliary
doublet distribution of known shape but unknown magnitude, also on the body
surface, to produce the lift, see reference 5. The second panel method
considered employs equation (2) only and is called the doublet method, e.g. see
reference 6. The reason these two methods have been employed in the present
paper is that they produce quite different integral equations, being of the
first and second kind for the doublet and source method respectively.


To facilitate the discussion of various discretization schemes we rewrite 
equation (2) for the particular case of an unbounded flat plate as 


d 

V 

n 


(x) 


+ oo 





+00 





+00 



— oo 


( 3 ) 


where x and ξ are distances measured along the plate.
DISCRETIZATION OF INTEGRAL EQS


The aerodynamic influence coefficients are evaluated using a consistent, 
small curvature expansion of the integrals that remain after discretization, 
see Hess (ref. 7). Specifically, the profile curve that defines a two-dimen- 
sional body is approximated by a piecewise quadratic representation and the 
source, doublet distributions are approximated by piecewise linear, quadratic 
representations, respectively. These choices result in aerodynamic influence 
coefficients (AIC's) of second order accuracy in h, where h is the panel size. 


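Let the doublet representation for the case of an unbounded flat plate
be given by

\[ \tilde{\mu}(\xi) = \mu_i + \Bigl(\frac{d\mu}{d\xi}\Bigr)_i (\xi - \xi_i) + \frac{1}{2} \Bigl(\frac{d^2\mu}{d\xi^2}\Bigr)_i (\xi - \xi_i)^2 . \tag{4} \]

For the purpose of studying the dependency of the smoothing factor on the
discretization scheme the derivatives in equation (4) have been discretized by
3-point differences

\[ \Bigl(\frac{d\mu}{d\xi}\Bigr)_i = \frac{\mu_{i+1} - \mu_{i-1}}{2h} \quad \text{and} \quad \Bigl(\frac{d^2\mu}{d\xi^2}\Bigr)_i = \frac{\mu_{i+1} - 2\mu_i + \mu_{i-1}}{h^2} \tag{5} \]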


and by 5-point differences, resulting from a continuity requirement of y across 
panel edges , 


(d?).= ^-^i+2 ^ 


and 


2 

(H) = 


(6a) 


(6t) 


where μ_i is the value of the doublet representation μ̃ at ξ_i, which is the
midpoint of the panel with index i. For the case of the flat plate all panels
have equal size h. The difference between the 3-point and 5-point
representations, μ̃_3 and μ̃_5 respectively, turns out to be

~ ~ /r r ^ ^ / 'd^A ^ n2 / 7 ^ 

- ^-5-v 8 8 V Vi 


which is of the same order in h as higher order terms neglected in equation
(4). Thus the 3- and 5-point differences both result in a doublet
representation of third order accuracy in h. Both representations have
sufficient continuity at the panel edges such that the contributions of the
second and third integral in equation (3) may be neglected, being not larger
than the basic truncation error of the first integral in eq. (3).
Evaluating eq. (3) at panel control points, after substituting eq.
results in a system of algebraic equations 

+00 

\ <"^i> = art J “k+i’ i = "•••• “• 

vhere 

a = k In 1 1+32 k/(2k-3)(2k+l)^|+ ln| 1-8/(4k^-l) I , 
k ^ 

or 


= -^ ln| 1+l6/(4 k^-25) I + In | i-8/( 4k^-1 ) | 

+ •^ k Inl 1-8k/(4k^+k-15) 1 +2 k ln| 1+8 k/(4k^-4k-3) | 


( 4 ), 

( 8 ) 

(9) 


+ I k ln| 1 - 2 /( 2 k+l)| . ( 10 ) 

Equation (9) represents the AIC's resulting from the 3-point differences in
eq. (5) and the AIC's of eq. (10) correspond to 5-point differences (eq. (6b)).

For the case of a curved contour, such as an airfoil, we need a small
curvature expansion of the integrals as mentioned before. Moreover we will
take a nonuniform panel distribution. The resulting expressions of the AIC's
will not be presented here for the sake of brevity. However it should be
mentioned that the AIC's of the doublet distributions are based on a third
order accurate representation μ̃, requiring continuity of μ̃ across panel
edges, which involves a generalization of the 5-point differences in equation
(6) to nonuniform panels.


The resultant linear system of algebraic equations may be written as

\[ \sum_{j=1}^{n+n_c} a_{ij} u_j = f_i , \qquad i = 1,2,\ldots,(n+n_c), \tag{11} \]

where n is the total number of surface panels and n_c the number of components
of a multicomponent airfoil. The unknown parameters u_j for j = 1,...,(n+n_c)
denote σ_j (j = 1,2,...,n), c_j (j = 1,...,n_c) for the source method, where σ_j is
the value of the source representation at control point x_j and c_j the
magnitude of the auxiliary doublet distributions of component j. In case of
the doublet method u_j (j = 1,2,...,n+n_c) denotes μ_j (j = 1,2,...,n),
μ̄_j (j = 1,...,n_c), where μ_j is the value of the doublet representation at
control point x_j and μ̄_j the value of the doublet representation at the
endpoint of the integration interval of component j. The value of the doublet
distribution at the beginning of each integration interval is equated to zero
without any loss of generality. The ordering of the equations (11) is such
that the diagonal elements a_ii of the first n equations express the influence
of a parameter u_i at the control point x_i. The last n_c equations of system
(11) express the Kutta conditions at the trailing edges of the airfoil
components j (j = 1,...,n_c).
An example solution of the source method applied to a 12 percent thick
von Kármán-Trefftz airfoil with a trailing edge angle of 15 degrees (KT0012)
is shown in figure 1. Second order convergence of the n-dimensional vector
norms of the error in the tangential velocity component of this solution is
found, see figure 1.


RELAXATION SCHEMES 

The relaxation schemes, also called smoothing operators in MG algorithms,
exploit the behavior of the kernels of equations (1) and (2), being like 1/r
and 1/r^2, respectively, where r denotes the distance. This behavior tells us
that the high frequency components of the singularity distribution have a
short coupling range. Neglecting the far field connections between parameters
and control points should, therefore, be a sound basis for constructing
effective smoothing operators.

On the basis of this consideration we will present two basic classes of
relaxation schemes. The first class of schemes is based on incomplete LU
factorization (ref. 8) of the approximate system of linear equations that
remains after omitting the far field connections, resulting in an approximate
factorization (AF). The factors L and U of the LU factorization are forced to
have an extensive zero pattern by omitting the nonzero entries which may arise
outside of the intended nonzero pattern in the factors L and U during
factorization. The present AF scheme is different from the incomplete
factorization of algebraic equations associated with the discretization of
partial differential equations because there is no need to omit any far field
connections in the latter. Moreover the extensive zero pattern in the lower
and upper triangular factors need not be the same as the zero pattern of the
approximate system of linear equations that remains after omitting the far
field connections, although we have chosen these two patterns identical in the
present examples.
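The idea of relaxing with the near field only can be sketched in Python on a small dense model system. This is an illustration under stated assumptions, not the AF scheme of the paper: the model matrix with entries 1/(1+|i-j|)^3 is hypothetical, and because the near-field matrix is purely banded its LU factors produce no fill-in outside the band, so the sketch never has to drop entries during factorization. All function names are invented for the example.

```python
def near_field(A, n_a):
    """Keep entries with |i - j| <= n_a, omit the far field connections."""
    n = len(A)
    return [[A[i][j] if abs(i - j) <= n_a else 0.0 for j in range(n)]
            for i in range(n)]


def solve(M, b):
    """Gaussian elimination without pivoting (adequate for this dominant toy)."""
    n = len(b)
    M = [row[:] for row in M]
    x = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            if factor != 0.0:
                for j in range(k, n):
                    M[i][j] -= factor * M[k][j]
                x[i] -= factor * x[k]
    for i in range(n - 1, -1, -1):
        s = x[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def residual(A, u, f):
    return [fi - sum(aij * uj for aij, uj in zip(row, u))
            for row, fi in zip(A, f)]


def relax(A, M, u, f):
    """One residual correction sweep: u <- u + M^{-1} (f - A u)."""
    du = solve(M, residual(A, u, f))
    return [ui + di for ui, di in zip(u, du)]


n = 16
A = [[1.0 / (1 + abs(i - j)) ** 3 for j in range(n)] for i in range(n)]
f = [1.0] * n
M = near_field(A, 1)
u = [0.0] * n
for _ in range(30):
    u = relax(A, M, u, f)
final_residual = max(abs(r) for r in residual(A, u, f))
```

For this model matrix the omitted far field is small compared with the retained band, so the sweeps converge rapidly even when used as a stand-alone solver.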

A second class of relaxation schemes is based on the direct construction
of a sparse, approximate inverse. We may construct such an inverse if we
approximately satisfy each individual equation of system (11) in its turn by
directly solving a very small system of equations, comprising a subset of the
entries of the full system, for every unknown parameter. These small systems
should be chosen such that they include the coupling range of high frequencies.
Thus we relax each equation individually, distributing changes to its
neighboring parameters. This second class, which we will call natural
relaxation schemes (NRS), is also a general technique. An example of this
technique is given in the next section.


FOURIER ANALYSIS 


Let a relaxation scheme based on approximate factorization of equation
(8) be defined by

\[ \sum_{k=-n_a}^{n_a} a_k \mu^{\nu+1}_{k+i} = f_i - \Bigl( \sum_{k=-\infty}^{-n_a-1} + \sum_{k=n_a+1}^{\infty} \Bigr) a_k \mu^{\nu}_{k+i} , \tag{12} \]

where the superscript ν is the iteration index and f_i the right hand side
which is given. The pattern of far field connections which are neglected in
the approximate equation on the left hand side of equation (12) is denoted
by the integers k satisfying |k| > n_a. This zero pattern also applies to the
factorization. The convergence factor ρ of the θ component, defined in
reference 4, of the error in the solution during the iteration procedure (12)
is found to be

\[ \rho(\theta) = \Bigl| 2 \sum_{k=1+n_a}^{\infty} a_k \cos(k\theta) \Bigr| \Big/ \Bigl| a_0 + 2 \sum_{k=1}^{n_a} a_k \cos(k\theta) \Bigr| , \tag{13} \]

where the second summation term is to be omitted for n_a = 0. This convergence
factor as function of the frequency θ is shown in figures 2 and 3 for n_a = 0, 1,
2 and 4. It is seen that the convergence of the high frequencies, i.e.
θ > π/2, is of the order of 10^-2 for n_a ≥ 1.

The second relaxation scheme (NRS) for equation (8) is based on a sparse
inverse, ã_k, which is defined by

\[ \tilde{a}_k = 0 \ \text{for } |k| > n_a \quad \text{and} \quad \tilde{a}_k = g_k \ \text{for } |k| \le n_a , \tag{14} \]

where g_k is the solution of

\[ \sum_{j=-n_a}^{n_a} a_k \, g_j = \delta_{i0} , \qquad i = -n_a, \ldots, n_a , \tag{15} \]

with k = |j-i| and δ_i0 = Kronecker delta. Applying this approximate inverse
in a residual correction iteration process, see appendix, results in an error
amplification matrix given by I - ÃA. The matrix ÃA, denoted by B, is an
infinite, symmetric Toeplitz matrix because Ã and A are infinite, symmetric
Toeplitz matrices. This observation allows one to obtain the convergence
factor implied by this NRS scheme, similarly to equation (13). One finds:

\[ \rho(\theta) = \Bigl| 1 - b_0 - 2 \sum_{k=1}^{\infty} b_k \cos(k\theta) \Bigr| , \tag{16} \]

where b_k are the elements of B = ÃA. It may be verified that equations (13)
and (16) are identical for n_a = 0. The local smoothing factor ρ̄ is defined in
reference 4 by

\[ \bar{\rho} = \max_{\pi/2 \le \theta \le \pi} \rho(\theta) . \tag{17} \]

It is a significant measure by which the relative merits of equations (13)
and (16) may be judged for n_a ≥ 1. Values of ρ̄ for n_a = 0, 1, 2, 4 and 7 are
given in table 1, for both the 3- and 5-point difference schemes. It may be
seen that the smoothing factor of the AF scheme is considerably better than
that of the NRS scheme. Comparing the 3- and 5-point differences shows that
in case of the AF scheme the 5-point differences result in a lower smoothing
factor for n_a ≥ 1.
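The evaluation of such a smoothing factor can be sketched in Python. The sketch assumes that the convergence factor of the AF scheme has the form |2 Σ_{k>n_a} a_k cos(kθ)| / |a_0 + 2 Σ_{k≤n_a} a_k cos(kθ)| and that the smoothing factor is its maximum over the high frequencies π/2 ≤ θ ≤ π. The coefficients a_k = 1/(1+k)^3, truncated at k = 40, are hypothetical and are not the influence coefficients of the paper; the function names and the sampling of θ are likewise assumptions.

```python
import math


def convergence_factor(a, n_a, theta):
    """rho(theta) for symmetric Toeplitz coefficients a[0], a[1], ..."""
    near = a[0] + 2.0 * sum(a[k] * math.cos(k * theta)
                            for k in range(1, n_a + 1))
    far = 2.0 * sum(a[k] * math.cos(k * theta)
                    for k in range(n_a + 1, len(a)))
    return abs(far) / abs(near)


def smoothing_factor(a, n_a, samples=181):
    """Maximum of rho(theta) over sampled frequencies in [pi/2, pi]."""
    thetas = [math.pi / 2 + (math.pi / 2) * i / (samples - 1)
              for i in range(samples)]
    return max(convergence_factor(a, n_a, t) for t in thetas)


a = [1.0 / (1 + k) ** 3 for k in range(41)]   # hypothetical coefficients
s0 = smoothing_factor(a, 0)
s1 = smoothing_factor(a, 1)
```

For these model coefficients, retaining one neighbour on each side (n_a = 1) already gives a smaller smoothing factor than the pointwise scheme (n_a = 0), in line with the qualitative trend described in the text.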
Applying the source or doublet method to the parallel flow around a circular
cylinder (no lift) results in a symmetric, circulant matrix in eq. (11), whose
elements are denoted by $c_k$ ($k = 0, 1, 2, \ldots, (n-1)$), provided we use
uniform paneling. The convergence factor of the errors in the solution during
the iteration procedure (12) applied to these circulants is found to be


p(e.) 


■n/2 


n/2-1 

k=1+na 


cos 



a 

2 Zj 


k=1 


cos 


(k0.)|, (18) 


where the discrete frequencies $\theta_i$ extend over $2\pi i/n$, $i = 0, 1,
\ldots, n/2$. Equation (18) turns out to be identical with equation (13) in the
limit $n \to \infty$. The smoothing factors obtained from eq. (18) for the
doublet method, as shown in table 1, reflect this observation. For the source
method the smoothing factors obtained from eq. (18) are given in table 2. These
factors tend to zero in the limit of n going to infinity, which is
characteristic for "MG algorithms of the second kind".


Although the results obtained above do give valuable insight into the
smoothing properties of relaxation schemes, the local mode analysis cannot
take into account the effects of such practical things as surface slope
discontinuities and/or nonuniform paneling of the surface. An n-dimensional,
discrete Fourier transform of the residue amplification matrix
$I - A\tilde{A}$ (see appendix), given by

$$G = F\,(I - A\tilde{A})\,F^{-1}, \qquad (19a)$$

$$F = (f_{kl}), \quad f_{kl} = \frac{1}{\sqrt{n}} \exp[\,i\,2\pi kl/n\,], \quad k, l = 0, 1, \ldots, (n-1), \qquad (19b)$$

$$F^{-1} = \bar{F} \ \text{(the complex conjugate)}, \qquad (19c)$$

is more suitable to study these aspects of the smoothing problem. The matrix
$\tilde{A}$ in equation (19a) may either be an actual inverse (NRS) or the
implied inverse of an AF scheme. Let the row sum of $G$ be defined by

$$\lambda_k = \sum_{l=0}^{n-1} |g_{kl}|, \quad k = 0, \ldots, (n-1). \qquad (20)$$

This row sum can be shown to be an upper bound of the convergence factor of
the $\theta_k$ component of the residue vector in a residual correction
iteration process, where $\theta_k = 2\pi k/n$ occupies the unique part of the
frequency range for $k = 0, 1, \ldots, n/2$. These considerations allow us to
define a global smoothing factor $\bar{\lambda}$ by

$$\bar{\lambda} = \max_{n/4 < k \le n/2} \lambda_k, \qquad (21)$$

analogous to the local smoothing factor (eq. (17)). It should be noted,
however, that this global smoothing factor is only an upper bound of the
convergence of the high-frequency components because the transformation in
eqs. (19b) and (19c) results in a matrix G which is not diagonal, the
off-diagonal elements representing the coupling of differing frequencies.
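A minimal sketch of the row-sum computation of equations (19) to (21) is given
below. The test matrices are assumptions chosen only so that the result can be
checked by hand: A is a periodic three-point Laplacian (a circulant) and the
approximate inverse is a scaled identity (damped Jacobi with weight 2/3). For
such circulants G is diagonal and the row sums reduce to the local mode
result; for an airfoil matrix one would pass the actual A and the inverse
implied by the chosen relaxation scheme. The function name is hypothetical.

```python
import numpy as np

def global_smoothing_factor(A, A_tilde):
    """Row sums of G = F (I - A A_tilde) F^-1 and their maximum over n/4 < k <= n/2."""
    n = A.shape[0]
    idx = np.arange(n)
    F = np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)   # eq. (19b)
    M_r = np.eye(n) - A @ A_tilde                                   # residue amplification matrix
    G = F @ M_r @ F.conj()                                          # eqs. (19a), (19c)
    row_sums = np.abs(G).sum(axis=1)                                # eq. (20)
    high = (idx > n / 4) & (idx <= n / 2)
    return float(row_sums[high].max()), row_sums                    # eq. (21)

n = 32
I = np.eye(n)
A = 2 * I - np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)   # assumed periodic model matrix
A_tilde = I / 3.0                                            # assumed: weight 2/3 times 1/diagonal
lam_bar, lam = global_smoothing_factor(A, A_tilde)
print(lam_bar)
```

In practice the products with F would be replaced by a Fast Fourier Transform;
the dense form is kept here only to stay close to the notation of the text.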

The global smoothing factor of the AF scheme applied to the source and
doublet method for the KT0012 profile (see fig. 1) has been determined using
a Fast Fourier Transform algorithm. The particular AF scheme used is
characterized as before by the far field connections omitted from equation
(11) and the subsequent zero pattern forced onto the incomplete LU
factorization of the resultant sparse matrix. These two sets, the far field
connections $a_{ij}$ and the zero pattern, are defined by the pairs of integers
$(i,j)$ satisfying

$$|i-j| > n_a \ \text{and}\ |i+j-n-1| > n_a \ \text{and}\ i \le n \ \text{and}\ j \le n. \qquad (22)$$

Table 3 gives the computed global smoothing factors for n = 32, 64, 128 and 256
and for $n_a$ = 0, 1, 2 and 4. From this table the following conclusions are
drawn. The smoothing improves as the dimension of the nonzero pattern $n_a$ is
increased. There is no qualitative difference between the source and doublet
method, the smoothing factors being approximately independent of the number of
panels. This is expected of the doublet method, but the source method results
are qualitatively different from those of table 2. Numerical experiments
suggest that this qualitative difference is a direct result of the surface
slope discontinuity at the trailing edge.

The similarity between the source and doublet method may also be observed
from the results plotted in figures 4 and 5, where the row sum is shown as a
function of frequency for $n_a$ = 0 and 1. In the case of $n_a = 1$ a typical
smoothing character is observed, i.e. the convergence bound decreases with
increasing frequency.


MULTIGRID ALGORITHM


The multigrid algorithm is described by the following quasi-FORTRAN 77
program, see also reference 9:

      SUBROUTINE MG (i, ℓ, u^ℓ, r^ℓ, p, m, q)
      INTEGER p, q
      k = ℓ $ it(ℓ) = i
    1 IF (k.EQ.1) GOTO 4
    2 CALL SMOOTHING (r^k, u^k, p)
      r^(k-1) = RESTRICTION (r^k)
      k = k-1 $ u^k = 0 $ it(k) = m
      GOTO 1
    4 CALL DIRECTSOLVER (r^k, u^k)
    5 IF (k.EQ.ℓ) RETURN
      k = k+1
      δu^k = PROLONGATION (u^(k-1))
      u^k = u^k + δu^k $ r^k = r^k - A^k δu^k
      CALL SMOOTHING (r^k, u^k, q)
      it(k) = it(k) - 1
      IF (it(k).EQ.zero) GOTO 5
      GOTO 2
      END 'OF MG'

      SUBROUTINE SMOOTHING (r^k, u^k, pq)
      INTEGER pq
      DO 1 I = 1, pq
      δu^k = RELAXATION SCHEME (r^k)
      u^k = u^k + δu^k $ r^k = r^k - A^k δu^k
    1 CONTINUE
      RETURN $ END 'OF SMOOTHING'

One call to subroutine MG (i, ℓ, u^ℓ, r^ℓ, p, m, q) performs i iterations
of the basic multigrid cycle, where ℓ is the number of levels, k (= ℓ, ..., 1)
the current level, u^ℓ the initial solution at level ℓ (taken equal to zero in
the present examples) and r^ℓ the corresponding residue at level ℓ. The
parameters p, q and m specify the multigrid strategy, m being the number of
times the coarse level correction is entered consecutively.
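The control flow of the quasi-FORTRAN program can be sketched recursively as
follows. This is not the author's code: recursion replaces the GOTO
structure, the level index grows toward the coarse levels (the reverse of the
paper's k), the smoother is plain Richardson iteration (approximate inverse
equal to the identity) standing in for the AF scheme, and the test matrix is a
hypothetical second-kind operator (identity plus a smooth kernel). The
parameters p, q and m have the meaning given above.

```python
import numpy as np

def build_levels(A_fine, n_levels):
    """Return [(A, R, P), ...] from finest to coarsest, with coarse A = R A P."""
    levels = [(A_fine, None, None)]
    for _ in range(n_levels - 1):
        A = levels[-1][0]
        n_coarse = A.shape[0] // 2
        P = np.kron(np.eye(n_coarse), np.ones((2, 1)))  # copy each coarse value to two fine panels
        R = 0.5 * P.T                                    # average panel pairs (equal panel lengths)
        levels.append((R @ A @ P, R, P))
    return levels

def smoothing(A, r, u, sweeps):
    """Residual correction sweeps with A_tilde = I (assumption); updates r and u in place."""
    for _ in range(sweeps):
        du = r.copy()
        u += du
        r -= A @ du

def mg(levels, k, r, p, q, m, iterations):
    """Return a correction u with A_k u approximately r, using `iterations` cycles on level k."""
    A = levels[k][0]
    if k == len(levels) - 1:
        return np.linalg.solve(A, r)                    # direct solver on the coarsest level
    _, R, P = levels[k + 1]
    r = r.copy()
    u = np.zeros_like(r)
    for _ in range(iterations):
        smoothing(A, r, u, p)
        du = P @ mg(levels, k + 1, R @ r, p, q, m, m)   # coarse level correction, entered m times
        u += du
        r -= A @ du
        smoothing(A, r, u, q)
    return u

n = 64
x = (np.arange(n) + 0.5) / n
A = np.eye(n) + 0.5 / n * np.cos(x[:, None] - x[None, :])   # assumed model matrix
f = np.ones(n)
levels = build_levels(A, n_levels=4)
u = mg(levels, 0, f, p=1, q=1, m=1, iterations=10)
reduction = np.abs(f - A @ u).max() / np.abs(f).max()
print(reduction)
```

The residue is carried along and updated after every correction, as in the
program above, so no level ever has to recompute f - Au from scratch.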


The only operators that remain to complete the description of this MG
algorithm are the restriction, prolongation and coarse level equations $A^k$.

Let a panel distribution on level $k$ be denoted by $h_i^k$ ($i = 1, \ldots,
n^k$), where $h$ is the panel length and $n^k$ the number of panels. Define the
coarse levels recursively by

$$n^{k-1} = \tfrac{1}{2}\, n^k \quad\text{and}\quad h_i^{k-1} = h_{2i-1}^k + h_{2i}^k \quad (i = 1, \ldots, n^{k-1}).$$

Let the restriction operator $R_{ij}^k$ ($i = 1, \ldots, (n^{k-1}+n_c)$,
$j = 1, \ldots, (n^k+n_c)$) and the prolongation $P_{ij}^k$
($i = 1, \ldots, (n^k+n_c)$, $j = 1, \ldots, (n^{k-1}+n_c)$) be defined:

$$R_{ij}^k = h_j^k / h_i^{k-1} \quad\text{for}\ j = 1, \ldots, n^k \ \text{with}\ i = \mathrm{IFIX}\bigl(\tfrac{j+1}{2}\bigr) \qquad (23a)$$

$$R_{ij}^k = 1 \quad\text{for}\ j = (n^k+1), \ldots, (n^k+n_c) \ \text{with}\ i = j - n^{k-1} \qquad (23b)$$

$$P_{ij}^k = 1 \quad\text{for}\ i = 1, \ldots, n^k \ \text{with}\ j = \mathrm{IFIX}\bigl(\tfrac{i+1}{2}\bigr) \qquad (23c)$$

$$P_{ij}^k = 1 \quad\text{for}\ i = (n^k+1), \ldots, (n^k+n_c) \ \text{with}\ j = i - n^{k-1} \qquad (23d)$$

$$R_{ij}^k = P_{ij}^k = 0 \quad\text{for all other pairs of integers}\ (i,j) \qquad (23e)$$

If we let the fine level equations be given by equation (11) and be denoted
by $A^\ell$, then we may define the coarse level equations recursively by
(see ref. 10)

$$A^{k-1} = R^k A^k P^k, \qquad (24)$$

which choice has been motivated by the results of Wesseling (Ref. 11), who
found the Galerkin coarse grid approximation (eq. 24) to be better than coarse
grid discretization of the continuous problem.
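The transfer operators (23a) to (23e) and the Galerkin product (24) can be
sketched as follows for one pair of levels. The panel lengths and the fine
level matrix in the example are made up for illustration, n_c denotes the
number of additional equations that are carried unchanged from level to level,
and the function name is hypothetical. Indices in the code are 0-based, so the
1-based rule i = IFIX((j+1)/2) becomes i = j // 2.

```python
import numpy as np

def transfer_operators(h_fine, n_c=0):
    """Build R (coarse x fine) and P (fine x coarse) following eqs. (23a)-(23e)."""
    h_fine = np.asarray(h_fine, dtype=float)
    n_k = h_fine.size                        # number of fine panels, assumed even
    n_km1 = n_k // 2
    h_coarse = h_fine[0::2] + h_fine[1::2]   # coarse panel = two merged fine panels
    R = np.zeros((n_km1 + n_c, n_k + n_c))
    P = np.zeros((n_k + n_c, n_km1 + n_c))
    for j in range(n_k):
        i = j // 2
        R[i, j] = h_fine[j] / h_coarse[i]    # (23a): panel-length weighted average
        P[j, i] = 1.0                        # (23c): piecewise constant prolongation
    for c in range(n_c):
        R[n_km1 + c, n_k + c] = 1.0          # (23b): extra equations copied
        P[n_k + c, n_km1 + c] = 1.0          # (23d): extra unknowns copied
    return R, P, h_coarse

h = [1.0, 3.0, 2.0, 2.0]                     # assumed panel lengths
R, P, h_coarse = transfer_operators(h, n_c=1)
A_fine = np.arange(25, dtype=float).reshape(5, 5) + 10 * np.eye(5)   # placeholder matrix
A_coarse = R @ A_fine @ P                    # eq. (24)
print(R @ P)
```

With these weights R P is the identity on the coarse level, so a coarse
vector survives prolongation followed by restriction unchanged.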
MULTIGRID CONVERGENCE


In this section we illustrate the convergence characteristics of the MG
algorithm described above by applying it to a number of potential flow
problems, restricting ourselves to the source method and the AF scheme.

The first example pertains to the KT0012 profile, shown in figure 1,
placed in a uniform flow at an angle of attack of 20 degrees. The AF scheme
used is characterized by the logical expression (22). The convergence history,
i.e. the norm $L_\infty$ of the residue as a function of the number of fine grid
residue evaluations, is shown in figure 6, where we have chosen n = 32, ℓ = 2,
p = 1, q = 1, m = 1. Here we have defined

$$L_\infty(r^\ell) = \| r^\ell(\nu) \|_\infty \,/\, \| r^\ell(\nu = 0) \|_\infty, \qquad (25)$$

being the ratio of the maximum norm of the current ($\nu$) residual vector
$r^\ell$ and the maximum norm of the initial ($\nu = 0$) residue, i.e. the right
hand side of equation (11). The observed convergence factor in figure 6 is
about twice as good as the global smoothing factor, which represents an upper
bound of the convergence factor of the high frequencies obtained from a
one-level analysis. These findings indicate that $\bar{\lambda}$ is a rather
conservative estimate, although it is very realistic with respect to the
effect of the nonzero pattern $n_a$.

Increasing the number of levels to 5, see figure 7, does not change the
asymptotic MG convergence rate, although the initial convergence improves
somewhat. Using the AF scheme as a classical iteration procedure (fig. 7), i.e.
omitting the coarse level corrections, is found to be quite ineffective, as
should have been expected from the Fourier analysis.

Let us define the computational work associated with one residue evaluation
at the finest level as one work unit, in order to be able to compare various
multigrid strategies. The results given in table 4 indicate costs ranging
from 2.0 to 2.7 work units per $10^{-1}$ reduction in the maximum norm of the
residual vector over a range of strategies p, q with m = 1. Convergence to the
level of the truncation error is obtained within 2 MG cycles. MG strategies
with m = 2, i.e. entering the coarse level correction two times consecutively,
are found to be computationally less efficient, see table 4.
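The cost measure used here, work units per factor-of-ten reduction of the
residue, can be sketched as below. The function name and the sample history
are hypothetical, and the averaging window is an assumption: the footnote of
table 4 states only that averages over the last 2 of 4 MG cycles were used.

```python
import math

def work_units_per_digit(work, residual_norm):
    """Work spent per decimal digit gained between the first and last history entries.

    work          : cumulative work units at each recorded point
    residual_norm : maximum norm of the residue at the same points
    """
    digits = -math.log10(residual_norm[-1] / residual_norm[0])
    return (work[-1] - work[0]) / digits

# Made-up history: two cycles of 5 work units each, two digits gained per cycle.
cost = work_units_per_digit([0.0, 5.0, 10.0], [1.0, 1e-2, 1e-4])
print(cost)
```

To average over the last two cycles only, one would pass the last three
entries of each history.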

The second example illustrates the convergence of the MG algorithm when
applied to the problem of a wing plus flap configuration shown in figure 8. The
convergence history is shown in figure 9. It is observed that the AF smoothing
is quite effective for $n_a \ge 1$, although we have just repeated the nonzero
pattern for single-component airfoils given in expression (22) by applying it
to each submatrix corresponding to the wing alone and flap alone for
$(i,j) \le n$. No zero entries for $i > n$ or $j > n$ have been created. This
zero pattern, which neglects all connections in the matrix between wing and
flap for $(i,j) \le n$, results in acceptable AF smoothing for a gap of 2.6
percent. However, in the limit of vanishing gap size for a fixed paneling we
would, of course, have to take some nonzero entries representing the flap/wing
connections into account if acceptable AF smoothing is to be retained in this
limit, as has been confirmed by numerical experiments. These observations
clearly lead to the requirement that all near field connections have to be
taken into account during the construction of a particular AF smoothing scheme.

CONCLUSION 

Approximate factorization (AF) relaxation schemes provide a smoothing 
capability that allows one to construct a fast multigrid method for solving 
integral equations of the second as well as equations of the first kind. 

The local mode analysis of Brandt (Ref. 4) is applied for the special
cases of an unbounded flat plate and circular cylinder and predicts the
qualitative difference between multigrid problems of the first and second
kind, where the former has a smoothing factor independent of h and the latter
a smoothing factor proportional to h. For more realistic geometries, having
surface slope discontinuities such as airfoils, Fourier analysis predicts no
qualitative difference between smoothing factors obtained with the AF scheme
when applied to integral equations of either the first or second kind.

Numerical experiments show that convergence to the level of the truncation
error of a second order accurate integral method can be obtained within 2 MG
cycles.
APPENDIX


Let a residual correction iterative scheme to solve the matrix equation,
Au = f, be given by

$$\delta u^{(\nu+1)} = \tilde{A}\, r^{(\nu)}, \qquad (A-1)$$

$$u^{(\nu+1)} = u^{(\nu)} + \delta u^{(\nu+1)}, \qquad (A-2)$$

and

$$r^{(\nu+1)} = r^{(\nu)} - A\, \delta u^{(\nu+1)}, \qquad (A-3)$$

with the iteration index $\nu$ (= 0, 1, 2, ...). Setting the initial solution
$u^{(0)}$ equal to zero results in an initial residue $r^{(0)}$ equal to the
right hand side f. The column vector $\delta u^{(\nu+1)}$ is the correction to
the approximate solution $u^{(\nu)}$ and $\tilde{A}$ is an approximate inverse
of A. This inverse is either constructed (NRS scheme) or implied by the AF
scheme. For the latter case we have

$$\tilde{A} = (LU)^{-1}.$$

This iterative scheme results in an error amplification matrix, $M_e$,
defined by

$$u^{(\nu+1)} - u = M_e\,(u^{(\nu)} - u), \quad \nu = 0, 1, 2, \ldots, \qquad (A-4)$$

which reads: $M_e = I - \tilde{A}A$. The corresponding residue amplification
matrix, defined by

$$r^{(\nu+1)} = M_r\, r^{(\nu)}, \quad \nu = 0, 1, 2, \ldots, \qquad (A-5)$$

is equal to: $M_r = I - A\tilde{A}$. In case A and $\tilde{A}$ are either
circulant matrices or infinite Toeplitz matrices one finds
$\tilde{A}A = A\tilde{A}$ and $M_e = M_r$. For a more general matrix A
resulting from the airfoil problem, equation (11), we have chosen to analyze
the residue amplification matrix $M_r$. The choice of the zero pattern (22) in
the AF scheme when applied to the $(n + n_c)$-dimensional matrix equation (11)
results in an n-dimensional matrix $M_r = I - A\tilde{A}$.
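The relations of this appendix can be checked numerically with a small
sketch. The circulant below is a made-up example, and the approximate inverse
is taken as the exact inverse of its near-field part (|k| <= n_a), which is
only a stand-in for the inverse implied by an AF scheme; the helper name is
hypothetical.

```python
import numpy as np

def circulant(c):
    """Circulant matrix with first row c: entry (i, j) equals c[(j - i) mod n]."""
    return np.array([np.roll(c, i) for i in range(len(c))])

n, n_a = 8, 1
c = np.array([4.0, -1.0, -0.5, 0.0, 0.0, 0.0, -0.5, -1.0])     # assumed symmetric circulant
k = np.arange(n)
c_near = np.where(np.minimum(k, n - k) <= n_a, c, 0.0)          # keep near-field entries only
A = circulant(c)
A_tilde = np.linalg.inv(circulant(c_near))                      # stand-in approximate inverse

M_e = np.eye(n) - A_tilde @ A    # error amplification matrix
M_r = np.eye(n) - A @ A_tilde    # residue amplification matrix

f = np.arange(1.0, n + 1)
u = np.zeros(n)
r = f.copy()
for _ in range(5):
    du = A_tilde @ r             # (A-1)
    u = u + du                   # (A-2)
    r = r - A @ du               # (A-3)

print(np.abs(M_e - M_r).max())
```

Because both matrices are circulant they commute, so the two amplification
matrices coincide, and the recursively updated residue agrees with f - Au.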
TABLE 1.- THEORETICAL SMOOTHING FACTORS FOR DOUBLET METHOD
WITH NORMAL-VELOCITY BOUNDARY CONDITIONS

                     Unbounded flat plate                   Circ. cylinder
            NRS        NRS        AF         AF         AF          AF
            eq. (16)   eq. (16)   eq. (13)   eq. (13)   eq. (18)    eq. (18)
            3-P        5-P        3-P        5-P        5-P         5-P
                                                        n=512       n=128

n_a = 0     0.415      0.797      0.415      0.797      0.795       0.790
n_a = 1     0.406      0.534      0.163      0.0227     0.0232      0.0249
n_a = 2     0.321      0.421      0.0422     0.0251     0.0252      0.0255
n_a = 4     0.261      0.326      0.0156     0.0102     0.0102      0.0103
n_a = 7     0.202      0.256      0.0052     0.0037     0.0037      0.0037

3-P: 3-point differences
5-P: 5-point differences
NRS: Natural Relaxation Scheme
AF:  Approximate Factorization

TABLE 2.- THEORETICAL SMOOTHING FACTORS FOR SOURCE METHOD
APPLIED TO CIRCULAR CYLINDER [equation (18); n_a = 0]

n = 128      n = 256      n = 512
0.00792      0.00397      0.00199


TABLE 3.- GLOBAL SMOOTHING FACTORS OF APPROXIMATE FACTORIZATION (AF)
FOR SOURCE AND DOUBLET METHODS APPLIED TO KT0012 PROFILE

                   SOURCE METHOD                DOUBLET METHOD (5-P)
n =         32     64     128    256        32     64     128    256

n_a = 0     1.35   1.36   1.35   1.34       1.75   1.83   1.87   1.89
n_a = 1     0.49   0.54   0.55   0.56       0.29   0.24   0.22   0.23
n_a = 2     0.32   0.36   0.34   0.34       0.24   0.23   0.22   0.22
n_a = 4     0.19   0.23   0.24   0.23       0.20   0.13   0.11   0.10

TABLE 4.- WORK UNITS PER 10^-1 REDUCTION (1) IN THE RESIDUE OVER A RANGE
OF STRATEGIES. [AF smoothing, n_a = 1; KT0012, α = 20°, n = 256, ℓ = 5, i = 4]

p    q    m    Work units per digit

0    1    1    2.7
0    2    1    2.2
1    0    1    2.2
1    1    1    2.2
1    2    1    2.2
2    0    1    2.0
2    1    1    2.5
2    2    1    2.3
0    1    2    4.3
1    0    2    5.3
1    1    2    3.6

(1) Average values over last 2 MG cycles of a total of 4 MG cycles.



Figure 1.- Pressure distribution of 12 percent thick Karman-Trefftz profile
with 15° trailing edge angle; discretization error of this solution as
function of the number of panels.

Figure 2.- Convergence factor as function of frequency for doublet method
(equation (13) with 5-point differences); n_a = 0 and 1, AF scheme.

Figure 3.- Convergence factor as function of frequency for doublet method
(equation (13) with 5-point differences); (a) n_a = 2, (b) n_a = 4, AF scheme.

Figure 4.- Row sum of amplification matrix G as function of frequency for
source method; KT0012, n = 256, n_a = 0 and 1, AF scheme.

Figure 5.- Row sum of amplification matrix G as function of frequency for
doublet method; KT0012, n = 256, (a) n_a = 0, (b) n_a = 1, AF scheme.

Figure 6.- Convergence history of MG algorithm; 2 levels, source method,
AF scheme. [KT0012 profile, incidence α = 20°, finest level n = 32, no. of
levels ℓ = 2, p = 1, q = 1, m = 1. Solid line: relaxation; dashed line: coarse
level correction. Axes: L_∞(residue) against no. of fine level residue
evaluations.]

      n_a    Global smooth. factor    Observed conv. factor
      0      1.35                     0.80
      1      0.49                     0.23
      2      0.32                     0.18
      4      0.19                     0.12

Figure 7.- Convergence history of MG algorithm; 5 levels, source method,
AF scheme. [KT0012 profile, incidence α = 20°, finest level n = 256, no. of
levels ℓ = 5. Solid line: relaxation; dashed line: coarse level correction.
Closed symbols: p=12; open symbols: p = 1, q = 1, m = 1. Axes: L_∞(residue)
against no. of fine level residue evaluations.]

Figure 8.- Pressure distribution of NLR 7301 plus 32 percent flap at 6 degrees
incidence.

Figure 9.- Convergence history of MG algorithm; 5 levels, source method, AF
scheme. [NLR 7301 plus flap, incidence α = 6°, finest level: n = 128 on wing,
n = 80 on flap, no. of levels 5, p = 1, q = 1, m = 1. Solid line: relaxation;
dashed line: coarse level correction. Horizontal axis: no. of fine level
residue evaluations.]
UNIGRID METHODS FOR BOUNDARY VALUE PROBLEMS WITH NONRECTANGULAR DOMAINS


W. Holland 

National Center for Atmospheric Research 
1. Introduction

Multigrid methods are generally very effective for solving differential 
boundary value problems. This is true because the smooth error, which is 
slow to converge during relaxation, is reduced by iterating on the problem 
projected onto coarser grids, where relaxation is both cheaper and more ef- 
ficient. Fine grid relaxation can then be viewed as an attempt to eliminate 
the high frequency error. 

In lieu of coarse grid iterations, one can, in fact, modify the fine 
grid relaxation process in order to reduce the smooth error directly on the 
fine grid (i.e., without the use of coarser grids at all). Under certain 
assumptions (see section 3), the resulting method, so-called unigrid [1],
is theoretically equivalent to conventional multigrid but has significantly
different computational characteristics . For example, unigrid requires less 
storage and shorter code, but significantly more arithmetic work. More im- 
portantly, it is much easier to apply to a given problem because most of the 
design work for the grid transfers and coarse grid operators is automatic. 
Thus, existing software packages that solve possibly very complex problems 
by SOR, for example, can be easily modified for application of unigrid. This
can usually be done by making a few changes in the relaxation routine without 
impacting any of the other software routines or data structures. These fea- 
tures make unigrid effective as a multigrid software simulator for quick and 
easy determination of the applicability of multigrid to a given problem. 

Unigrid is developed in section 2, its relationship to multigrid is des- 
cribed in section 3, some simple theory is presented in section 4, and its 
use is illustrated with a North Atlantic basin oceanographic model problem 
in section 5. This application demonstrates how unigrid (and, hence, multi- 
grid) can be used efficiently with vector computers on problems with irregu- 
lar domains . 


2. Unigrid 

Assume given the d-dimensional operator equation:

(2.1)   $AU = F, \quad U \in H_1,$

where $A: H_1 \to H_2$ is a linear operator and $H_1$ and $H_2$ are appropriate
Hilbert spaces of functions defined on a region $\Omega$ in $R^d$, $d \ge 2$.
Assume that (2.1) admits discretizations by a family of matrix equations,
parameterized by admissible grid sizes $h > 0$ and given by:

(2.2)   $A^h U^h = f^h, \quad U^h \in H^h,$

where $H^h = R^{n^h}$ and $n^h$ is an integer (approximately proportional to
$h^{-d}$). Upper case $U^h$ will denote the exact solution of (2.2) and lower
case $u^h$ its approximation. The grid transfers are full rank linear
operators, represented by $I_h^{h'}: H^h \to H^{h'}$, that satisfy the
consistency condition $I_{h'}^{h''} I_h^{h'} = I_h^{h''}$ for admissible $h$,
$h'$, $h''$ when $h < h' < h''$ or $h > h' > h''$.

The objective is to reduce the error from a current approximation $u^h$ in
the subspace defined by a set of directions $\mathcal{D}^h = (d_1^h, d_2^h,
\ldots, d_n^h)$. Letting $D^h = [d_1^h, d_2^h, \ldots, d_n^h]$, a Ritz
projection can be performed that corrects $u^h$ by a function in the span of
$\mathcal{D}^h$, so that the projection of the resulting residual over the
subspace is zero. This leads to the problem of finding some
$s = (s_1, \ldots, s_n)^T$ so that:

$$(D^h)^T\, [A^h (u^h + D^h s) - f^h] = 0.$$

This can be rewritten as:

$$(D^h)^T A^h D^h s = (D^h)^T\, [f^h - A^h u^h].$$

Gauss-Seidel relaxation on this system with some initial approximation $s$ and
a new approximation $\bar{s}$ can be written as:

$$\bar{s}_i = \Bigl( f^h - A^h u^h - \sum_{j>i} s_j A^h d_j - \sum_{j<i} \bar{s}_j A^h d_j,\ d_i \Bigr) \Big/ (A^h d_i, d_i), \quad i = 1, 2, \ldots, n.$$

$u^h$ can then be corrected by:

$$u^h \leftarrow u^h + D^h \bar{s}.$$

If $A^h$ is linear, then corrections can be made to $u^h$ directly, rather
than to $s$, resulting in the directional iteration

(2.3)   $u^h \leftarrow u^h + \dfrac{(f^h - A^h u^h,\ d_i)}{(A^h d_i,\ d_i)}\, d_i,$

where left arrow denotes replacement. Rewriting (2.3) as:

(2.4) (u^, d) 

then one sweep with initial guess $u^h$ consists of iterating with (2.4) in
sequence over $d_k^h$, $k = 1, 2, \ldots, n^h$. For example, Gauss-Seidel is
specified by the choice $d_k^h = e_k^h$, the $k$-th coordinate vector in $H^h$.
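The statement that coordinate directions reproduce Gauss-Seidel can be
checked with a short sketch. The matrix and right-hand side are arbitrary
test data and the function names are hypothetical; the update inside
`directional_sweep` is the directional iteration (2.3) applied to one
direction at a time.

```python
import numpy as np

def directional_sweep(A, f, u, directions):
    """One sweep of u <- u + (f - A u, d) / (A d, d) * d over the given directions."""
    u = np.array(u, dtype=float)
    for d in directions:
        r = f - A @ u
        u += (r @ d) / ((A @ d) @ d) * d
    return u

def gauss_seidel_sweep(A, f, u):
    """Classical lexicographic Gauss-Seidel sweep, for comparison."""
    u = np.array(u, dtype=float)
    for i in range(len(f)):
        u[i] = (f[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
    return u

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
f = np.array([1.0, 2.0, 3.0])
u0 = np.zeros(3)
u_dir = directional_sweep(A, f, u0, np.eye(3))   # rows of the identity are the e_k
u_gs = gauss_seidel_sweep(A, f, u0)
print(u_dir, u_gs)
```

Any other direction set can be passed in the same way, which is what the
unigrid construction below exploits.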

To define unigrid for a given admissible grid size $h = H_0$, suppose
$m \ge 1$ is an integer so that $H_q = 2^q h$ are admissible, $q \le m$. Now
define the direction sets for unigrid according to:

(2.5)   $\mathcal{D}^{H_q} = (d_1^{H_q}, d_2^{H_q}, \ldots, d_{n^{H_q}}^{H_q}), \quad 0 \le q \le m,$

where $d_k^{H_q} = I_{H_q}^{h}\, e_k^{H_q}$. Thus, the directions on level $q$
are just the relaxation directions on grid $H_q$ transferred to grid $h = H_0$.

One of the many possible unigrid schemes is described in terms of the
relaxation parameters $\nu$ and $\nu_c$ and the cycling parameter $\gamma$. The
unigrid cycles are then defined recursively by: one unigrid cycle on level $q$
consists first of $\nu$ unigrid relaxation sweeps via (2.4) with directions
$d_k^{H_q}$, $k = 1, 2, \ldots, n^{H_q}$, followed for $q < m$ by $\gamma$
cycles on level $q+1$ and for $q = m$ by $\nu_c$ more sweeps via (2.4).
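The recursive cycle just described can be sketched for a one-dimensional
model problem. Everything specific in this sketch is an assumption made for
illustration: the operator is the three-point discretization of -u'' on (0,1)
with zero boundary values, the level-q directions are hat functions of
half-width 2^q centered on the grid-H_q points (coordinate vectors
interpolated linearly to the fine grid), and all function names are
hypothetical.

```python
import numpy as np

def level_directions(L, q):
    """Hat functions on the fine grid (2**L - 1 points) centered on the level-q grid points."""
    j = np.arange(1, 2**L)
    width = 2**q
    return [np.maximum(0.0, 1.0 - np.abs(j - k * width) / width)
            for k in range(1, 2**(L - q))]

def sweep(A, f, u, directions):
    """One pass of the directional iteration (2.3) over a direction set; updates u in place."""
    for d in directions:
        u += ((f - A @ u) @ d) / ((A @ d) @ d) * d

def unigrid_cycle(A, f, u, dirs, q, m, nu, nu_c, gamma):
    """nu sweeps on level q, then gamma cycles on level q+1, or nu_c more sweeps if q == m."""
    for _ in range(nu):
        sweep(A, f, u, dirs[q])
    if q < m:
        for _ in range(gamma):
            unigrid_cycle(A, f, u, dirs, q + 1, m, nu, nu_c, gamma)
    else:
        for _ in range(nu_c):
            sweep(A, f, u, dirs[q])

L, m = 5, 4
N, h = 2**L - 1, 1.0 / 2**L
A = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
f = np.ones(N)
exact = np.linalg.solve(A, f)
dirs = [level_directions(L, q) for q in range(m + 1)]
u = np.zeros(N)
energies = [np.sqrt((exact - u) @ A @ (exact - u))]
for _ in range(10):
    unigrid_cycle(A, f, u, dirs, 0, m, nu=2, nu_c=1, gamma=1)
    energies.append(np.sqrt((exact - u) @ A @ (exact - u)))
print(energies[-1] / energies[0])
```

Only the fine grid vector u is stored; the price is that every directional
update works with fine grid residuals, which reflects the trade of less
storage for more arithmetic mentioned in the introduction.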


Remark. The directions defining unigrid depend not directly on the operator
A but rather on the domain $\Omega$. Using linear interpolation, these
directions $d_i^{H_q}$ are in fact the $i$-th grid $H_q$ coordinate vectors
interpolated to the finest grid $H_0$. In one dimension, this is illustrated by
the following figures.

[Figure: one-dimensional direction $d_2^{2h}$]

For rectangular $\Omega$ in two dimensions, each direction is a product of two
such functions, one in x and the other in y, resulting in the "tent" function
described as follows.


With the usual double subscript notation and $n^h = N^h \times N^h$, then
$d_{k,\ell}^h = e_{k,\ell}^h$ for $h = H_0$, $1 \le k, \ell \le N^h$. The coarse
grid directions are defined so that the $i,j$ component of $d_{k,\ell}^{H_q}$
is:

(2.7)   $d_{k,\ell}^{H_q}(i,j) = (2^q - |k-i|)(2^q - |\ell-j|)$ for $|k-i|, |\ell-j| < 2^q$, and $0$ otherwise.

This assumes that the point denoted by $(k,\ell)$ is a point of the $H_q$ grid.

In irregular regions where boundaries do not lie on coarse grid lines,
there are several options possible for treating these boundaries. The most
obvious, which is analogous to the usual multigrid approach, is to define the
directions as the interpolated coarse grid coordinate vectors and use the
(zero) boundary conditions properly in interpolation. This is illustrated in
one dimension as in:



Note that this requires special handling of the coarse grid points that are
adjacent to the boundary. Another approach is simply to ignore those
directions which would overlap the boundary, so that such a direction is
suppressed as in:

In section 5, this will be referred to as the contracted boundary method.

This means that some points near the boundary are not corrected by
smooth error iterations, so the danger is that convergence is slowed (see
section 5).

Another possibility is to enlarge the region, so that it is aligned
with the coarse grid directions, but ignore correcting that part of the
expanded region that does not lie in the interior of $\Omega$. This is
illustrated



Note that this method, which is here called the expanded boundary approach,
does not require extra information at the boundaries, so the directions can be
computed once for each grid over the entire domain $\Omega$ and stored in the
form of a matrix stencil.


3. Multigrid 


One multigrid cycle on problem (2.2) with present approximation $u^h$,
right-hand side $f^h$, and $h = H_q$, is denoted by $MG_h(u^h, f^h)$ and
defined recursively by:

For $q = m$, $MG_h(u^h, f^h)$ consists of $\nu + \nu_c$ relaxation sweeps via
(2.4) with directions $e_k^h$, $k = 1, 2, \ldots, n^h$.

For $q < m$, $MG_h(u^h, f^h)$ consists of:

Step 1. Perform $\nu$ relaxation sweeps via (2.4) with directions $e_k^h$,
$k = 1, 2, \ldots, n^h$.

Step 2. Let $r^h = f^h - A^h u^h$, $r^{2h} = I_h^{2h} r^h$,
$u^{2h} \leftarrow 0$, and perform $\gamma$ grid $2h$ cycles via
$MG_{2h}(u^{2h}, f^{2h})$ with $f^{2h} = r^{2h}$.

Step 3. Set $u^h \leftarrow u^h + I_{2h}^h u^{2h}$.

Multigrid is theoretically equivalent to unigrid if, as is henceforth
assumed, the formulation of the coarser grid equations satisfies the
variational conditions:

(3.1)   $A^{2h} = I_h^{2h} A^h I_{2h}^h,$

(3.2)   $I_h^{2h} = \alpha\, (I_{2h}^h)^T,$

where $\alpha$ is a scalar. To see this, consider the following immediate
replacement multigrid algorithm. The method depends directly on the fine grid
right-hand side $f^0$ and its cycles on grid $h = H_q$ are denoted by
$MGIR_q(f^0)$, where $q$ is used in place of $H_q$ as a subscript or
superscript. It is characterized as a modification of conventional multigrid
applied to (2.2) in which all coarse grid changes are immediately reflected in
the fine grid approximation and the fine grid residual is recomputed and used
to redefine the coarse grid equations. The algorithm is defined in terms of
$MGIR_q(f^0)$ by:

For $q = m$, $MGIR_q(f^0)$ consists of performing $\nu + \nu_c$ relaxation
sweeps via

(3.3)   $u^0 \leftarrow u^0 + \dfrac{\langle I_0^q (f^0 - A^0 u^0),\ e_k^{H_q} \rangle}{\langle A^{H_q} e_k^{H_q},\ e_k^{H_q} \rangle}\; I_q^0\, e_k^{H_q}, \quad k = 1, 2, \ldots, n^{H_q}.$

For $q < m$, $MGIR_q(f^0)$ consists of $\nu$ relaxation sweeps via (3.3)
followed by $\gamma$ level $q+1$ cycles via $MGIR_{q+1}(f^0)$.

Note that the immediate fine grid correction is incorporated in the
relaxation scheme. This scheme on a level $q > 0$ is just (2.3) with
$u^p = 0$, $0 < p \le q$, and $r^q = I_0^q (f^0 - A^0 u^0)$, followed by
interpolation of the correction directly to the finest grid.


It is not difficult to see [1] that MGIR is fully equivalent to MG under
condition (3.1) - (3.2). This is done by noting what the status of
intermediate MG calculations would be if coarse grid changes were immediately
reflected in the fine grid approximation. By examining the iterative formulae,
it is easy to see that MGIR and unigrid are identical, from which it follows
that multigrid designed according to the variational conditions (3.1) -
(3.2) is theoretically equivalent to unigrid.
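The variational conditions can be illustrated on a standard one-dimensional
case. This is a sketch under stated assumptions, not part of the original
argument: the fine operator is the three-point Laplacian with zero boundary
values, the interpolation is linear, and the scalar in (3.2) is taken as 1/2
(full weighting). With these choices the coarse operator defined by (3.1)
coincides with the three-point Laplacian on the grid of size 2h. The function
names are hypothetical.

```python
import numpy as np

def laplacian_1d(n, h):
    """Three-point discretization of -u'' with zero boundary values on n interior points."""
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def linear_interpolation(n_coarse):
    """Interpolation from n_coarse coarse points to 2*n_coarse + 1 fine points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for k in range(n_coarse):
        P[2 * k, k] = 0.5
        P[2 * k + 1, k] = 1.0      # coarse point k coincides with fine point 2k+1
        P[2 * k + 2, k] = 0.5
    return P

n_coarse, h = 7, 1.0 / 16
P = linear_interpolation(n_coarse)
R = 0.5 * P.T                                  # condition (3.2) with alpha = 1/2
A_h = laplacian_1d(2 * n_coarse + 1, h)
A_2h_galerkin = R @ A_h @ P                    # condition (3.1)
print(np.abs(A_2h_galerkin - laplacian_1d(n_coarse, 2 * h)).max())
```

For operators or boundaries where the two coarse operators differ, it is the
product R A P, not a rediscretization, that keeps multigrid equivalent to
unigrid.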

Unigrid code is typically very compact, partly because it lacks the 
modular structure of multigrid software. This is one reason that unigrid 
code can be developed very quickly. Also, there are fewer design choices 
with unigrid, since the coarse grid and grid transfer operators are automat- 
ically determined. This also adds to ease of programming, but restricts the 
flexibility of the method. The design of unigrid also guarantees convergence 
independently of the choice of the coarse level iteration directions and cy- 
cling scheme, so mistakes may slow convergence but do not result in diver- 
gence as often as for multigrid. 

This ease of programming and small program size make unigrid an effective
method to test the convergence behavior of multigrid for many application
problems. It can easily be used to replace the usual relaxation or direct
solvers in existing programs in order to perform such a feasibility test. Of
course, the amount of work involved makes any comparison of solution time
meaningless, but actual multigrid efficiency can be determined by applying the
usual multigrid operation counts to the unigrid cycling scheme. Since the
methods are equivalent in terms of results when multigrid is implemented
according to (3.1) - (3.2), unigrid will accurately represent the numerical
performance of such a variationally formulated multigrid scheme.


4. Theory

Assuming that $A^h$ is symmetric and positive definite, define the energy
inner product and norm on $H^h$ by:

$$\langle x^h, y^h \rangle_A = \langle A^h x^h, y^h \rangle$$

and

$$\| x^h \|_A = \langle A^h x^h, x^h \rangle^{1/2},$$

h h h 

respectively. Let W. denote the set of all A -unit eigenvectors of A whose 

^ h h 

eigenvalues are no larger than X and let y , denote its A -orthogonal comple- 
ment. Let G denote one pass of (2.4) over the fine grid directions P and 
assume it spans H^. For each integer v > 1, define as the restriction 
of (A^) ^^^( (G^)^)"^A^(G^)^(A^) to (For Jacobi-type versions of 

(2. A), this latter operator simplifies to (G ) .) Then, with 2m the degree 
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of the differential operator in ( 2 . 1)5 assume (cf., [2]): 

Al . There exists constants a > 0 and independent of h and so 

that: ^ 

ll»" - 8(4)11 

A 

h h h 

for all admissible h and all w e W , , where R(I^, ) is the range of 

h h ^h 

of l2h‘ (Note that R(l2h^ ~ Span (P^), the coarse grid directions . ) 

A2 . There exist constants c- , c_ > 0 with c^ < 1 and c„ < (c^ + l)h 

h -L^ -L J. 

p (A ) , where p(A^) is the spectral radius of A , so that: 

p(G^ < max{c^5 (l - c^Xh^^l"^}. 

THEOREM. Suppose $\gamma \ge 1$ and $m$ and $\nu_c$ are so that the error from
the coarsest level does not significantly contribute to the finest level error.
Then there exists a $\nu$ independent of h so that unigrid converges to the
solution of (2.2) by a fixed linear rate independent of h.

PROOF. This theorem follows from the results of section 3 that relate unigrid
to multigrid and from the theory of [2] slightly modified to account for the
class of relaxation methods depicted in (2.4).

Relaxation does not generally minimize the residual error, although it
should approximately. In fact, when direct application of unigrid to (2.2)
exhibits convergence but does not monotonically reduce the residual error on
the coarse levels, this is a signal that the directions for relaxation are
improperly defined. They should be chosen to approximate the smooth
eigenvectors of $A^h$, that is, those that belong to the lower end of its
spectrum. This would ensure that relaxation quickly eliminates the oscillatory
eigenvector components of the error with little effect on the smooth ones.
Since the spectrum of $A^h$ that corresponds to these oscillatory components is
narrow, there is a close relationship between error in the energy norm, for
which relaxation is a minimizer, and the residual error norm. The residual norm
is not generally minimized by relaxation, but a proper choice of directions
coupled with a good smoothing rate ensures that it will be monotonically
reduced.

5. Numerical Results 

This section contains a report on numerical experiments with unigrid
applied to the solution of the model problem:

(5.1)   $-\nabla^2 u + \lambda u = f$ in $\Omega$,
        $u = g$ on $\partial\Omega$,

where $\Omega$ is an irregular domain used to describe the North Atlantic
basin. In this case, $\Omega$ is rectangular on three sides but irregular on
the fourth, as depicted by:



$\lambda$ is a given function which is set to the constant 64 in the following
experiments. (Such a value for $\lambda$ results in strong positive
definiteness of the operator in (5.1), leading to very fast convergence rates
for multigrid. However, such a value is fairly realistic for this application
and sharply depicts the disadvantage of using the contracted boundary method
described in section 2.) In these experiments, the fine grid spacing is
h = 0.0625 and the rectangle encompassing $\Omega$ is [0,3] x [0,2]. In each
case, a very simple grid cycling scheme with four grids is used, where each
cycle involves three relaxations, each performed in turn on grids 8h, 4h, 2h,
and h. Four cycles are made for each of the three problems, with u = 0 as the
initial guess. The usual central five-point stencil was used to discretize
(5.1).
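The discretization described above can be sketched as a matrix-free stencil
application. This is an illustration under assumptions: grid points outside
the region (marked False in `mask`) are treated as boundary points with value
zero, the mask used in the example is simply the full rectangle because the
actual basin outline is not given in the text, and the function name is
hypothetical.

```python
import numpy as np

def apply_operator(u, mask, h, lam):
    """Five-point stencil for -laplace(u) + lam * u at the points where mask is True."""
    v = np.where(mask, u, 0.0)               # zero boundary data outside the region
    p = np.pad(v, 1)                         # surround the grid with boundary zeros
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * v) / h**2
    return np.where(mask, -lap + lam * v, 0.0)

h, lam = 0.0625, 64.0
nx, ny = int(3 / h) - 1, int(2 / h) - 1      # interior points of [0,3] x [0,2]
mask = np.ones((ny, nx), dtype=bool)         # assumed: full rectangle, no irregular side
out = apply_operator(np.ones((ny, nx)), mask, h, lam)
print(out[ny // 2, nx // 2], out[0, 0])
```

An irregular fourth side would be modelled by setting the corresponding
entries of `mask` to False, which keeps the boundary on grid h vertices.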


The main feature of the discretization of (5.1) is that the boundary is
enforced to pass through grid h vertices. Although this represents only an
approximation to the actual boundary (of reduced order), it has conservative
properties that are not easily obtained any other way. More specifically,
conservation of kinetic energy, vorticity, and enstrophy in a dissipationless
finite difference discretization of atmospheric diffusion problems can be
easily guaranteed when the grid points and irregular boundary points coincide
(cf., [3]). However, although this is an advantage when used on a vector
processor, coarse grids in the usual multigrid process will not generally
share this simplified property. The question then is whether or not one of
the means for preserving this feature on coarser grids (namely, boundary
contraction or expansion) will maintain the efficiency of the usual multigrid
process. Such is the objective of the experiments reported in this section.

To compare the contracted and expanded boundary methods with the usual 
multigrid, unigrid was used on the Cray 1 at NCAR as a simple tool to simu- 
late multigrid performance. Instead of comparisons with the usual multigrid 
on the irregular region, it was much simpler to compare the two methods with 
the analogous (i.e., naturally extended) problem defined on the entire rec- 
tangle [0,3] X [0,2]. Thus, a function U on this rectangle was chosen to 
determine f and the usual unigrid algorithm was run on the full rectangle. 

The results are depicted in the first column of the table. Both the
contracted and expanded boundary methods were also tried with the same f, but
with f restricted to the irregular region $\Omega$. The results are depicted in
the second and third columns of the table, respectively. Note the severe
degradation in convergence for the contracted boundary method. As might be
expected, however, there is almost no loss of efficiency with the expanded
boundary approach.

Although these are admittedly very limited experiments, they are
representative of the numerical experience with several such tests that were
conducted. Generally, although full multigrid (FMG) vastly and expectedly
improves the performance of the contracted boundary method, it remains
somewhat less efficient than conventional multigrid. On the other hand, the
expanded boundary method seems generally as (or nearly as) efficient as, and
therefore preferable to, the usual multigrid approach, especially for use on
vector processors such as the Cray 1.
DYNAMIC RESIDUAL ERROR

(For each cycle and level, the three stacked entries in a column are listed in the order given in the source.)

Cycle    Relaxation   Multigrid on      Contracted         Expanded
Number   Level        [0,3] x [0,2]     Multigrid on Ω     Multigrid on Ω

  1      8h           7.402E+03         6.027E+03          7.418E+03
                      1.197E+03         1.147E+03          2.465E+03
                      2.745E+02         1.753E+02          1.037E+03
         4h           2.617E+03         2.384E+03          2.364E+03
                      7.758E+01         8.864E+01          2.598E+02
                      3.416E+00         3.927E+00          2.187E+02
         2h           4.270E+02         5.663E+02          4.311E+02
                      8.350E+01         1.359E+02          1.002E+02
                      2.200E+00         4.139E+01          2.602E+01
         h            1.090E+02         2.716E+02          1.437E+02
                      1.784E+01         8.348E+01          2.630E+01
                      7.771E+00         3.836E+01          9.057E+00

  2      8h           3.620E+01         3.602E+01          3.749E+01
                      9.004E+00         6.381E+00          9.451E+00
                      2.383E+00         1.145E+00          4.826E+00
         4h           3.236E+01         3.607E+01          3.166E+01
                      8.235E-01         1.383E+00          4.491E+00
                      3.198E-02         6.425E-02          2.348E+00
         2h           6.099E+00         4.050E+01          6.358E+00
                      1.139E+00         1.006E+01          1.373E+00
                      2.956E-01         3.073E+00          3.161E-01
         h            1.363E+00         2.030E+01          2.116E+00
                      2.232E-01         9.095E+00          4.554E-01
                      1.048E-01         4.764E+00          1.694E-01

  3      8h           3.602E-01         9.720E-01          5.507E-01
                      8.874E-02         3.538E-02          1.325E-01
                      2.134E-02         7.684E-03          9.404E-02
         4h           3.599E-01         3.644E+00          4.807E-01
                      8.122E-03         1.276E-01          1.743E-01
                      3.198E-04         6.083E-03          1.064E-01
         2h           1.127E-01         4.954E+00          1.354E-01
                      2.044E-02         1.126E+00          3.299E-02
                      5.259E-03         3.595E-01          9.587E-03
         h            2.329E-02         2.687E+00          4.209E-02
                      4.590E-03         1.272E+00          1.035E-02
                      2.076E-03         7.181E-01          4.465E-03

  4      8h           4.434E-03         1.617E-01          8.480E-03
                      9.947E-04         4.012E-03          4.178E-03
                      2.325E-04         5.582E-04          3.872E-03
         4h           4.774E-03         6.814E-01          9.658E-03
                      1.055E-04         2.275E-02          4.446E-03
                      4.965E-06         1.167E-03          3.061E-03
         2h           2.096E-03         7.891E-01          3.475E-03
                      3.632E-04         1.697E-01          7.559E-04
                      9.150E-05         5.688E-02          2.512E-04
         h            6.079E-04         4.213E-01          1.248E-03
                      1.561E-04         2.089E-01          3.633E-04
                      6.461E-05         1.243E-01          1.590E-04
BLACK BOX MULTIGRID


J. E. Dendy, Jr.


Los Alamos National Laboratory 


Abstract 


One major problem with the multigrid method has been that each new
grid configuration has required a major programming effort to develop a
code that specifically handles that grid configuration. Such a penalty is
not required for methods like SOR, ICCG, etc.; in these methods, one need
only specify the matrix problem, no matter what the grid configuration.
In this paper we investigate such a situation for the multigrid method.
The end result is a code, BOXMG, in which one need only specify the
(logically rectangular, positive definite) matrix problem; BOXMG does
everything else necessary to set up the auxiliary coarser problems to achieve
a multigrid solution.

I. INTRODUCTION


In the multigrid method, one attempts to solve a discrete approximation

    $L^M U^M = F^M$                                                   (1)

to a continuous equation

    $LU = F$.                                                         (2)

To do this one constructs a sequence of grids $G^1, \ldots, G^M$ with
corresponding mesh sizes $h_1 > \cdots > h_M$. In its simplest mode of
operation, one does a fixed number, IM, of relaxation sweeps (Gauss-Seidel,
for example) on equation (1) and then drops down to grid $G^{M-1}$ and the
equation

    $L^{M-1} V^{M-1} = f^{M-1} \equiv I_M^{M-1}(F^M - L^M v^M)$,      (3)

where $V^{M-1}$ is to be the coarse grid approximation to $U^M - v^M$, where
$v^M$ is the last iterate on grid $G^M$, and where $I_M^{M-1}$ is an
interpolation operator from $G^M$ to $G^{M-1}$. To solve equation (3)
approximately one resorts to recursion, taking ID relaxation sweeps on grid
$G^k$ before dropping down to grid $G^{k-1}$, $M-1 \ge k \ge 2$, and the
equation

    $L^{k-1} V^{k-1} = f^{k-1} \equiv I_k^{k-1}(f^k - L^k v^k)$.      (4)

When grid $G^1$ is reached, the equation $L^1 V^1 = f^1$ can be solved
directly and $v^2 \leftarrow v^2 + I_1^2 V^1$ performed. Then one does IU
relaxation sweeps on grid $G^{k-1}$ before forming
$v^k \leftarrow v^k + I_{k-1}^k v^{k-1}$, $3 \le k \le M$. (This description
assumes $M \ge 3$, the cases M = 1 or 2 being trivial.)
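
The fixed cycle just described can be sketched in a few lines. The following is only an illustration under assumptions that are not part of the paper: a one-dimensional model problem $-u'' = f$ with zero Dirichlet values, Gauss-Seidel relaxation, full weighting for the fine-to-coarse transfer, linear interpolation for the coarse-to-fine transfer, and rediscretization on the coarser grids. The function names (`relax`, `residual`, `restrict`, `prolong`, `cycle`) are hypothetical and are not taken from BOXMG.

```python
def relax(v, f, h, sweeps):
    """Gauss-Seidel sweeps for (2 v[i] - v[i-1] - v[i+1]) / h^2 = f[i], zero boundary values."""
    n = len(v)
    for _ in range(sweeps):
        for i in range(n):
            left = v[i - 1] if i > 0 else 0.0
            right = v[i + 1] if i < n - 1 else 0.0
            v[i] = 0.5 * (left + right + h * h * f[i])


def residual(v, f, h):
    """Return f - L v for the three point operator above."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(f[i] - (2.0 * v[i] - left - right) / (h * h))
    return out


def restrict(r):
    """Full weighting from 2m+1 fine points to m coarse points."""
    m = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2] for j in range(m)]


def prolong(e, n):
    """Linear interpolation from m coarse points to n = 2m+1 fine points."""
    fine = [0.0] * n
    for j, ej in enumerate(e):
        fine[2 * j] += 0.5 * ej
        fine[2 * j + 1] += ej
        fine[2 * j + 2] += 0.5 * ej
    return fine


def cycle(v, f, h, sweeps_down, ID, IU):
    """One fixed cycle on the grid holding v; sweeps_down is IM on the finest grid, ID below it."""
    n = len(v)
    if n == 1:                        # coarsest grid: solve 2 v / h^2 = f directly
        v[0] = 0.5 * h * h * f[0]
        return
    relax(v, f, h, sweeps_down)
    fc = restrict(residual(v, f, h))  # right hand side of the coarse equation
    vc = [0.0] * len(fc)
    cycle(vc, fc, 2.0 * h, ID, ID, IU)
    if len(vc) > 1:
        relax(vc, fc, 2.0 * h, IU)    # IU sweeps on the coarser grid before correcting this one
    for i, c in enumerate(prolong(vc, n)):
        v[i] += c
```

A call such as `cycle(v, f, h, 2, 1, 1)` on a grid with $2^p - 1$ unknowns performs one cycle with IM = 2 and ID = IU = 1; repeating the call repeats the cycle.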

One advantage of the multigrid method is that one obtains a fixed
reduction factor, significantly less than one, in the residual
$F^M - L^M v^M$ of the current fine grid iterate per unit of work per unknown
on grid $G^M$. This is in sharp contrast to most iterative methods, for
example, SOR, where the reduction factor increases as a function of the
number of unknowns on grid $G^M$. Another advantage is that in many cases,
multigrid achieves truncation error in work that is a small multiple of the
number of unknowns. For further details, see references 1 and 2.

In most implementations of the multigrid method, the operators
$I_{k-1}^k$ have been grid dependent. In the simplest case, $G^k$ and
$G^{k-1}$ are rectangular grids, the grid points of $G^{k-1}$ are a subset of
the grid points of $G^k$, the grid spacing $h_{k-1}$ of $G^{k-1}$ is twice the
grid spacing $h_k$ of $G^k$, and the interpolation $I_{k-1}^k$ is bilinear.
(See Reference 1.) If there are always to be $G^k$ grid points on the
boundary, then there is a constraint on the number of x[y] grid points
NXM[NYM] on $G^M$ that NXM = (NXO - 1)$2^{M-1}$ + 1
[NYM = (NYO - 1)$2^{M-1}$ + 1], where NXO[NYO] is the number of x[y] grid
points on $G^1$. Otherwise, interpolation near the boundary is a special
case. The coding of interpolation is further complicated by whether the
points on the boundary represent knowns (as in Dirichlet boundary
conditions) or unknowns (as in Neumann boundary conditions).
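
The constraint on the number of grid points is easy to check numerically. The helper names below are hypothetical and serve only to illustrate the formula NXM = (NXO - 1)$2^{M-1}$ + 1; they are not part of any existing code.

```python
def finest_grid_points(n_coarsest, num_grids):
    """NXM (or NYM) implied by NXO (or NYO) points on the coarsest grid and M grids."""
    return (n_coarsest - 1) * 2 ** (num_grids - 1) + 1


def coarsest_grid_points(n_finest, num_grids):
    """Return NXO if n_finest satisfies the constraint for M = num_grids, else None."""
    intervals, remainder = divmod(n_finest - 1, 2 ** (num_grids - 1))
    if remainder != 0 or intervals < 1:
        return None
    return intervals + 1
```

For instance, 49 points in one direction are compatible with five grids (four points on the coarsest grid), whereas 48 points are not.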

Figure 1 shows two grids for a cell centered approximation to an
elliptic equation. (The x's represent $G^k$ and the o's $G^{k-1}$.) Now the
above constraint on unknowns doesn't help since the nearest $G^k$ point to
the boundary is $h_k/2$ from the boundary and the nearest $G^{k-1}$ point to
the boundary is $h_{k-1}/2$ from the boundary, where $h_{k-1}$ and $h_k$ are
the $G^{k-1}$ and $G^k$ mesh spacings, respectively. The incorporation of a
Neumann boundary condition, for example, on grid $G^k$ leads to frequencies
which are not damped out by bilinear interpolation, and convergence is
degraded. Again something special (either in the interpolation routine or
the relaxation routine) must be done at the boundary. This is easy in
principle, especially if Brandt is nearby to advise, but is a pain in
practice. There are two possible solutions in this case. One is to let
$h_{k-1} = 3h_k$, which again leads to the coding of a special interpolation.
The other is to use $G^{k-1}$ unknowns that are not a subset of $G^k$
unknowns, as in figure 2. This latter solution was the one employed in
reference 3. Bilinear interpolation in this case involves special coding
(for example, $a = \frac{1}{16}(9A + 3C + 3B + D)$ in figure 2), and there is
again a constraint on the number of $G^M$ unknowns to avoid special cases.
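
The weights 9, 3, 3, 1 are those of a tensor product of one-dimensional linear interpolation. As a sketch, assuming (the geometry of figure 2 is not reproduced here) that the fine unknown a lies one quarter of a coarse mesh width from its nearest coarse unknown A in both x and y, that B and C are the coarse neighbors of A in the two coordinate directions, and that D is the diagonal neighbor:

```latex
a \;=\; \tfrac34\cdot\tfrac34\,A \;+\; \tfrac34\cdot\tfrac14\,B \;+\; \tfrac14\cdot\tfrac34\,C \;+\; \tfrac14\cdot\tfrac14\,D
  \;=\; \tfrac{1}{16}\,\bigl(9A + 3B + 3C + D\bigr).
```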

In addition to the grid structure, the actual difference equations
cause programming difficulties. Consider, for example,

    $-\nabla \cdot (D(x,y) \nabla U(x,y)) + \sigma(x,y) U(x,y) = f(x,y)$,   $(x,y) \in \Omega$,
    $\nu(x,y) \cdot D(x,y) \nabla U(x,y) + \gamma(x,y) U(x,y) = 0$,   $(x,y) \in \partial\Omega$,      (5a)

where $\Omega = (0,A) \times (0,B)$ with boundary $\partial\Omega$, $\nu$ is the
outward normal to $\partial\Omega$, D is positive, $\sigma$ and $\gamma$ are
non-negative, and D, $\sigma$, and f are allowed to be discontinuous across
internal boundaries $\Gamma$ of $\Omega$; hence it is also assumed that

    U and $\mu \cdot (D \nabla U)$ are continuous at (x,y) for almost every
    $(x,y) \in \Gamma$ (where $\mu(x,y)$ is a fixed normal vector at (x,y)).      (5b)

If the finite difference approximation of equation (5a) is a vertex
centered one as in ref. 2, then the "classic" multigrid method of
Reference 1 ($I_{k-1}^k$ = bilinear interpolation, $I_k^{k-1}$ = a fixed nine
point weighting operator, and the coefficients of $L^{k-1}$ a fixed weighting
of the coefficients of $L^k$) performs well as long as the discontinuities in
D are not too severe and as long as $\Gamma$ doesn't consist of too many line
segments; otherwise, it performs badly; indeed, it can even fail to converge
in the fixed mode described above.

Reference 2 dealt with the situation in which D, $\sigma$, and f jump by
orders of magnitude across $\Gamma$. It considered many possible choices of
$I_{k-1}^k$, $I_k^{k-1}$, and $L^{k-1}$. Only one of these choices was found
to be robust. This choice was $I_{k-1}^k = J_{k-1}^k$ ($J_{k-1}^k$ defined
below),

    $I_k^{k-1} = (J_{k-1}^k)^*$,                                      (6)

and

    $L^{k-1} = (J_{k-1}^k)^* L^k J_{k-1}^k$.                          (7)

The choices, equations (6) and (7), are automatic in the finite element
formulation of multigrid (ref. 4). References 4 and 5 both observed that
$L^{k-1} = I_k^{k-1} L^k I_{k-1}^k$, with $I_k^{k-1}$ not necessarily equal to
$(I_{k-1}^k)^*$, is a good choice in that the residual of the corrected
solution vanishes when transferred to the coarse grid. This can be shown to
be a good feature if L is symmetric. In the finite element formulation of
multigrid, $I_k^{k-1} = (I_{k-1}^k)^*$ is also automatic. Indeed multigrid
finite element with piecewise bilinear elements was one of the methods
considered in reference 2 and found not to be robust.
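
To make the choices of interpolation, restriction, and coarse operator concrete, here is a minimal one-dimensional sketch, not the two-dimensional construction of the paper. It assumes a three point discretization of $-(D u')'$ with zero Dirichlet values, takes the interpolation weights at fine-only points from the difference operator itself, and forms the coarse operator as the product of the transposed interpolation, the fine operator, and the interpolation. All function names are hypothetical.

```python
def build_operator(d_half, h):
    """Dense three point operator for -(D u')'; d_half[i] is D on the interval left of unknown i."""
    n = len(d_half) - 1
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        west, east = d_half[i], d_half[i + 1]
        L[i][i] = (west + east) / h ** 2
        if i > 0:
            L[i][i - 1] = -west / h ** 2
        if i < n - 1:
            L[i][i + 1] = -east / h ** 2
    return L


def induced_interpolation(L):
    """Interpolation whose weights at fine-only points come from the rows of L."""
    n = len(L)
    m = (n - 1) // 2
    J = [[0.0] * m for _ in range(n)]
    for jc in range(m):
        J[2 * jc + 1][jc] = 1.0          # coarse points coincide with odd fine indices
    for i in range(0, n, 2):             # fine-only points: solve row i of L v = 0 for v[i]
        if i > 0:
            J[i][i // 2 - 1] = -L[i][i - 1] / L[i][i]
        if i < n - 1:
            J[i][i // 2] = -L[i][i + 1] / L[i][i]
    return J


def galerkin(L, J):
    """Coarse operator J^T L J for dense lists of lists."""
    n, m = len(J), len(J[0])
    LJ = [[sum(L[i][k] * J[k][j] for k in range(n)) for j in range(m)] for i in range(n)]
    return [[sum(J[k][i] * LJ[k][j] for k in range(n)) for j in range(m)] for i in range(m)]
```

With a jump in D from 1 to 1000 at a fine-only point, the induced weights are 1/1001 and 1000/1001 rather than the 1/2 and 1/2 of linear interpolation, which is the one-dimensional form of preserving the flux across the interface.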

The crucial choice, then, given equations (6) and (7) is the choice
of $I_{k-1}^k$. As discussed in reference 2, the first clue to the choice of
$I_{k-1}^k$ was that, because of equation (5b), $I_{k-1}^k$ should
approximately preserve the flux $\mu \cdot (D \nabla U)$ across $\Gamma$. In
certain problems, however, when there were large jumps in both D and
$\sigma$, it was discovered that on coarser grids where $h^2$ is large, the
interfaces in $\sigma h^2$ were as important as the interfaces in D. The
obvious solution is to use the difference operator for the interpolation
operator. In one space dimension with three point difference operators, it
is obvious how to do this. In two space dimensions, for the five point
discrete Laplacian, it can also be done easily by the use of skewed five
point discrete Laplacians; see Reference 6. This approach is doomed for
equation (5) for two reasons. First, accurate skewed approximations are
difficult if not impossible when interfaces are present. Second, even if
$L^M$ is a five point operator, the use of equation (7) generates nine point
$L^k$'s, k < M, making the above approach impossible. The solution arrived
at in reference 2 is as follows: Suppose that at (IF,JF), $L^k$ has the
pointwise template


    $-NW^k_{IF,JF+1}$     $-T^k_{IF,JF+1}$     $-SW^k_{IF+1,JF+1}$

    $-R^k_{IF,JF}$        $S^k_{IF,JF}$        $-R^k_{IF+1,JF}$          (8)

    $-SW^k_{IF,JF}$       $-T^k_{IF,JF}$       $-NW^k_{IF+1,JF}$


Form $\tilde R^k_{IF+1,JF} = R^k_{IF+1,JF} + SW^k_{IF+1,JF} + NW^k_{IF+1,JF+1}$,
$\tilde S^k_{IF+1,JF} = S^k_{IF+1,JF} - T^k_{IF+1,JF} - T^k_{IF+1,JF+1}$, and
$\tilde R^k_{IF+2,JF} = R^k_{IF+2,JF} + NW^k_{IF+2,JF} + SW^k_{IF+2,JF+1}$.


Then for horizontal lines embedded in the coarse grid,

    $v^k_{IF+1,JF} = \dfrac{\tilde R^k_{IF+1,JF}\, v^{k-1}_{IC,JC} + \tilde R^k_{IF+2,JF}\, v^{k-1}_{IC+1,JC}}{\tilde S^k_{IF+1,JF}}$.      (9)

(We have just summed equation (8) vertically to average out its
y-dependence.) A similar formula can be used for vertical lines embedded in
the coarse grid. Then, at fine grid points centered in coarse grid squares,
$v^k_{IF+1,JF+1}$ is obtained from the difference formula; i.e.,

    $v^k_{IF+1,JF+1} = \bigl( R^k_{IF+1,JF+1} v^k_{IF,JF+1} + R^k_{IF+2,JF+1} v^k_{IF+2,JF+1}$
        $+\; T^k_{IF+1,JF+1} v^k_{IF+1,JF} + T^k_{IF+1,JF+2} v^k_{IF+1,JF+2}$
        $+\; SW^k_{IF+1,JF+1} v^k_{IF,JF} + SW^k_{IF+2,JF+2} v^k_{IF+2,JF+2}$
        $+\; NW^k_{IF+1,JF+2} v^k_{IF,JF+2} + NW^k_{IF+2,JF+1} v^k_{IF+2,JF} \bigr) / S^k_{IF+1,JF+1}$.      (10)

The vertical analogue of (9), (9) and (10) constitute the definition of
$J_{k-1}^k$ alluded to immediately preceding equation (6) above.

The near ultimate insult is a cell-centered difference approximation
to equation (5) using $h_{k-1} = 3h_k$ or the grid structure of fig. 2. The
definition of $J_{k-1}^k$ which approximately preserves flux across $\Gamma$
in this case is not obvious, and the computation of
$(J_{k-1}^k)^* L^k J_{k-1}^k$ is a disaster. Desperation being the mother of
invention, one soon decides there has to be a better way.


II. BLACK BOX MULTIGRID

The better way has already been described; one only needs to interpret
it differently. The crucial observation is that once one has a pointwise
template like equation (8) for $L^M$, then the definition of $J_{k-1}^k$ and
$(J_{k-1}^k)^* L^k J_{k-1}^k$ is independent of where this template came from.
(We refer to this method as black box multigrid not because, as some would
have it, multigrid is black magic but because the code which implements
the method acts as a black box for the user; he need only specify the
difference equations on the finest grid since the code, BOXMG, generates
the auxiliary coarse problems.)

The same artifice allows one to get rid of the restriction on the
number of unknowns on the finest grid. For the situation depicted in
figure 3, for example, one can imagine fictitious coarse grid points. The
boundary conditions on the fine grid are incorporated into the operator,
as in reference 2, so that for points (IF,JF) on the right boundary, for
example, $R_{IF+1,JF} = SW_{IF+1,JF+1} = NW_{IF+1,JF} = 0$ in equation (8).
The boundary of the coarse grid doesn't coincide with the boundary of the
fine grid, but the boundary conditions will be picked up by the formation of
$(J_{k-1}^k)^* L^k J_{k-1}^k$.

An example of an extreme case of this artifice is the situation in
which one wants to solve a Dirichlet problem on a given irregular region.
One proceeds by embedding the region in a rectangle, writing down
difference equations at points interior to the region. These difference
equations incorporate the Dirichlet data on the boundary of the region in
such a way that there is no coupling between the interior points and the
other points. At the other points one writes down an equation
$a_{i,j} U_{i,j} = F_{i,j}$, where $a_{i,j} \ne 0$ and $F_{i,j}$ are arbitrary.
This artifice makes the problem logically rectangular. The solution to the
difference equation is obtained at the interior points, and the solution
$U_{i,j} = F_{i,j}/a_{i,j}$ is obtained at the other points. On a serial
machine, this process for solving irregular region problems may be
inefficient for some regions, since the number of other points can be quite
large. On a vector machine, however, the situation isn't clear, since the
embedding technique is immediately vectorizable and since other techniques
may vectorize with difficulty.
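
A minimal sketch of the embedding artifice follows. It is an illustration under several assumptions that are not in the paper: Laplace's equation with the five point stencil, a region boundary that passes through grid points, plain Gauss-Seidel in place of multigrid, and the hypothetical names `solve_embedded`, `mask`, `g`, `a_out`, and `f_out`.

```python
def solve_embedded(mask, g, sweeps=200, a_out=1.0, f_out=0.0):
    """Gauss-Seidel on a logically rectangular array.

    mask[j][i] is True at unknowns interior to the region; g(i, j) returns the
    Dirichlet value at a grid point that is not an interior unknown.
    """
    ny, nx = len(mask), len(mask[0])
    u = [[0.0] * nx for _ in range(ny)]
    for _ in range(sweeps):
        for j in range(ny):
            for i in range(nx):
                if not mask[j][i]:
                    u[j][i] = f_out / a_out      # decoupled equation a * u = F
                    continue
                total = 0.0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < nx and 0 <= jj < ny and mask[jj][ii]:
                        total += u[jj][ii]       # coupling to another interior unknown
                    else:
                        total += g(ii, jj)       # Dirichlet data moved to the right hand side
                u[j][i] = total / 4.0
    return u
```

The loop visits every point of the rectangle in the same way, which is what makes the problem logically rectangular; the values at the other points are simply F/a and never influence the interior unknowns.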

One disadvantage to the black box method is storage. In the situa- 
tion that the coefficients of the difference equations are easy to compute 
(for example, Laplace's equation on a rectangle), there is a storage 
penalty of at least five [seven] locations per fine grid point for the 

black box method for a five [nine] point operator; this assumes that the 

M 

right hand side is stored and that is computed and stored. If 

lx , Jx 

one is not going to restrict the number of unknowns on the finest grid, 
however, then not storing the coefficients means additional programming 
and checking for special cases. (If the checking involves an IF test in 
the inner loop of a double DO loop, the degradation in run time can be 
dramatic on a machine like a CDC 7600.) Moreover, we are more interested 
in problems like equation (5), where the coefficients of the difference 
equations are not easy to compute and have to be stored anyway. 

If we assume that the finest grid coefficients are stored, then there
is still a storage penalty for the black box method. First, even in the
case that the operator on the finest grid is a five point operator, nine
point operators are generated on the coarser grids. If it is assumed that
the given problem can be worked with five point operators on the coarser
grids (an assumption which is not at all clear for equation (5)), then an
extra two storage locations per coarse grid point are required, for a
total of 2(1/4 + 1/16 + ...) = 2/3 locations per fine grid point.
Second, the interpolation coefficients have to be stored, requiring four
locations for equation (9) and its vertical analogue. (Unfortunately,
since the coefficients in equation (9) don't necessarily sum to 1, both
$\tilde R_{IF+1,JF}/\tilde S_{IF+1,JF}$ and $\tilde R_{IF+2,JF}/\tilde S_{IF+1,JF}$
have to be stored.) Equation (10) requires no additional storage but does
require nine multiplies. By using equation (9) and its vertical analogue,
equation (10) can be rewritten in terms of coefficients of the four coarse
grid values at (IF,JF), (IF+2,JF+2), (IF,JF+2), and (IF+2,JF); this
reduces the computation of equation (10) to four multiplies but requires
four storage locations per coarse grid point. Hence interpolation as
currently implemented in BOXMG requires a total of 8(1/4 + 1/16 + ...) =
8/3 locations per fine grid point.
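
Both totals come from the same geometric series. As a worked check, assuming (as the series in the text implies) that each coarser grid has one quarter as many points as the next finer one:

```latex
\sum_{j=1}^{\infty} \frac{1}{4^{j}} \;=\; \frac{1/4}{1 - 1/4} \;=\; \frac{1}{3},
\qquad
2 \cdot \frac{1}{3} \;=\; \frac{2}{3},
\qquad
8 \cdot \frac{1}{3} \;=\; \frac{8}{3}.
```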

One can also ask the question of whether there is any disadvantage in
execution time with the black box method. The worst case is the case in
which the operator on the finest grid is a five point operator; to be
fair, let us assume that it is not the five point Laplacian, in order that
advantage cannot be taken of the very simple form of the coefficients in
the five point Laplacian. To be unfair to the black box method let us
assume that $I_k^{k-1}$ is injection in the "classic" multigrid method.
Experimentally, for easy equations, BOXMG achieves the reduction of the
error by a factor of 0.1 [0.05] per multigrid cycle for IU = ID = 1 and
IM = 2 [ID = 2, IU = 1, IM = 3]. This is in contrast to figures of 0.25 and
0.125 for "classic" multigrid. If the total work for "classic" multigrid and
black box multigrid is computed, including the work for $I_{k-1}^k$ and
$I_k^{k-1}$, and if the comparison is expressed in terms of the convergence
factor (convergence factor = reduction of error/work unit, where 1 work
unit = 8 floating point operations, the amount of work for one Gauss-Seidel
sweep on the finest grid), then the comparison is as follows:


                            convergence     convergence     convergence factor,
                            factor,         factor,         "classic" with
                            "classic"       black box       residual weighting

IU = ID = 1, IM = 2         0.66            0.66            0.75

IU = 2, ID = 1, IM = 3      0.64            0.66            0.74


Thus there is no penalty in convergence factor for the black box method. 
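
One way to read these figures, under the assumption (not spelled out in the text) that the convergence factor is the geometric mean reduction per work unit: if one cycle reduces the error by a factor $\rho$ and costs $W$ work units, then the convergence factor is $\mu = \rho^{1/W}$. The work per cycle implied by the first row of the comparison would then be roughly

```latex
\mu = \rho^{1/W}
\;\Longrightarrow\;
W = \frac{\ln \rho}{\ln \mu},
\qquad
W_{\text{black box}} \approx \frac{\ln 0.1}{\ln 0.66} \approx 5.5,
\qquad
W_{\text{classic}} \approx \frac{\ln 0.25}{\ln 0.66} \approx 3.3 .
```

On this reading, the black box cycle costs more work but reduces the error by a correspondingly larger factor, so the factor per work unit is about the same.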

There is a penalty, however, for the black box method in that the
computation of $I_{k-1}^k$ and $L^{k-1}$ is not without cost; this is startup
calculation time that doesn't have to be performed in the "classic"
multigrid method. As soon as one considers the convergence factor for
"classic" multigrid if nine point residual weighting is necessary (as it
will be for all but the simplest problems), then the degradation of
convergence factor makes it obvious that the black box method can pay for
its overhead. Moreover, the black box method will work problems that
"classic" multigrid can't handle. If, however, one is solving just to
truncation error, then the "classic" multigrid method is probably more
efficient for problems with smooth coefficients. The extra expense of cubic
interpolation the first time a grid is visited in the classic multigrid
method is probably more than offset by the expense of computing $L^k$,
k < M, in the black box method.

A relatively unimportant issue of implementation is whether
$I_k^{k-1} = (J_{k-1}^k)^*$ is necessary. In reference 2 a heuristic argument
was made for this choice, but experiments seemed to indicate that the use of
a fixed nine point weighting for $I_k^{k-1}$ did not lead to any significant
degradation of convergence factor, as long as $I_{k-1}^k = J_{k-1}^k$ and
equation (7) is used to define $L^{k-1}$. (In some problems, the fixed
weighting even gave slightly better convergence.) A nine point fixed
weighting for $I_k^{k-1}$ is not automatically correct at the boundary.
Hence, since $J_{k-1}^k$ is stored, it is easier to use $(J_{k-1}^k)^*$ for
$I_k^{k-1}$.

The multigrid algorithm described in Section I begins on the finest
grid $G^M$. In the full multigrid algorithm described by Brandt (ref. 1),
one begins on the coarsest grid instead and uses the coarser grids to
generate a good initial guess. For three grids, for example, the pattern
of grid transfer is $G^1 \to G^2 \to G^1 \to G^2 \to G^3 \to G^2 \to G^1 \to G^2 \to G^3$.
In Brandt's scheme, when a grid is visited for the first time, cubic
interpolation is used instead of bilinear interpolation, and when the finest
grid ($G^3$ in the example above) is visited for the second time, one has the
solution to truncation error. Indeed for equations with smooth
coefficients, not only are the pointwise values $h^2$ accurate but the
centered difference quotients approximate the first and second derivatives
to $h^2$ accuracy.
In BOXMG we have not implemented cubic interpolation nor the
generalization of it ((5.12) of reference 2) for equations with discontinuous
coefficients; the reason is that numerical experiments indicated no
advantage for either versus $J_{k-1}^k$ in equations with discontinuous
coefficients. This issue is discussed further in Section IV.


One final issue, discovered by Brandt, is the issue of the use of the
right hand side in interpolation. Generally, the use of the right hand
side provides an $O(h^2)$ correction to interpolation and is not worthwhile.
In black box multigrid, however, the right hand side at the boundary can
contain boundary data, and in such cases, not using the right hand side
can lead to O(1) interpolation errors at the boundary, and consequently
destroy all hope of solving to truncation error in one or two cycles.
Thus to the right hand side of equation (9) we add
$r^k_{IF+1,JF}/\tilde S^k_{IF+1,JF}$, and to the right hand side of equation
(10) we add $r^k_{IF+1,JF+1}/S^k_{IF+1,JF+1}$, where $r^k$ is the residual;
when a grid is visited for the first time $r^k = F^k$ (if a zero initial
guess is used).


III. THE PARAMETERS OF BOXMG 

In this section we discuss the parameters the user must specify to
use BOXMG. These are actually discussed in the comments of BOXMG following
the reading of the parameters, but we provide a little more detail
here. We hope this description and the examples of Section IV will make
the usage of BOXMG clear. We had originally intended to rewrite BOXMG in
perfect, portable Fortran. Ignoring for the moment whether such a beast
exists, we discovered that we were psychologically incapable of the quest.
Nevertheless, we still hope that BOXMG will prove useful and that its
coding is clear enough to be changed by others for their devious ends.

The grid in BOXMG is always logically rectangular. The parameters
NXM and NYM specify the number of unknowns in the x and y coordinates
respectively. HXM and HYM specify the x and y spacing respectively on the
finest grid; these parameters are only used in computing the discrete $L^2$
norm of the residual, since the user specifies the equations on the finest
grid. Indeed, since the equations on the finest grid can be written on a
Lagrangian grid, HXM and HYM may have little meaning in some cases.

TOL is the tolerance. In one mode of BOXMG, iteration is continued
until the discrete $L^2$ error of the residual is less than TOL or until the
accumulated number of multigrid cycles NCYC is equal to |ISTRT|.

IFD is an indicator for the scheme on the finest grid. If IFD = 1, a
five point scheme is assumed; otherwise, a nine point scheme is assumed.
IU, ID, and IM have already been discussed in Section I. The recommended
choices are IU = ID = 1 and IM = 2 or IU = 1, ID = 2, and IM = 3. For
problems with smooth coefficients the latter choice is slightly better;
for problems with rough coefficients the first choice is better. If
ISTRT < 0, BOXMG will begin iterating on the finest grid. If ISTRT > 0,
BOXMG will begin on the coarsest grid and will bootstrap itself up to the
finest grid, as discussed in Section II, and then continue cycling.

IRELAX is an indicator for the type of relaxation. IRELAX = 1 means
point relaxation. IRELAX = 2 means line relaxation by lines in x.
IRELAX = 3 means line relaxation by lines in y. IRELAX = 4 means line
relaxation by lines in x followed by line relaxation by lines in y. These
options are included for flexibility. For equations like
$\epsilon u_{xx} + u_{yy} = f$, $\epsilon \ll 1$ (or for $\Delta u = f$, where
$\Delta x \gg \Delta y$ on the finest grid), line relaxation by lines in y is
needed for a good smoothing rate (ref. 2). For
$u_{xx} + \epsilon u_{yy} = f$, line relaxation by lines in x is needed for a
good smoothing rate. In some cases, both are needed.

ITAU is an indicator for computing and printing an estimate of the 
truncation error. If ITAU = 0, then 




^M-1 -M-1 
^M ^M 




( 11 ) 


where $\tilde I_M^{M-1}$ is injection, is computed and printed. If
ITAU ≠ 0, then eq. (11) is not computed and printed. A discussion of this
feature is given in Section IV.

ICOEF determines when $(J_{k-1}^k)^* L^k J_{k-1}^k$ is computed. If
ICOEF = 0, then when M, the number of grids, is computed, ICOEF will be set
equal to M, and $(J_{k-1}^k)^* L^k J_{k-1}^k$ will be computed for
$k \le M$. If ICOEF = 1, then $L^k$ must be specified for every grid $G^k$,
$1 \le k \le M$, since $(J_{k-1}^k)^* L^k J_{k-1}^k$ will not be computed for
any grid. This feature allows the user to run something like "classic"
multigrid except that $J_{k-1}^k$ will still be computed by the code and may
or may not be bilinear interpolation; hence, the use of this option
(ICOEF = 1) may lead to divergence if discontinuities of orders of magnitude
exist in the coefficients.

IVW, MCYCL, ALPHL, and ALPHM are cycling parameters. IVW determines
the type of cycle to be performed. If IVW = 1, the usual V-cycle will be
performed. If IVW = 2, W-cycles will be performed. Larger values of IVW
give more exotic patterns. MCYCL is for the coarser grids what ISTRT is
for the finest grid; in each cycle grid $G^k$, k < M, will be visited MCYCL
times before grid $G^{k+1}$ is visited unless the discrete $L^2$ norm of
$f^{k-1} - L^{k-1} v^{k-1}$ is less than ALPHL (ALPHM if k = M) times the
discrete $L^2$ norm of the residual on $G^k$. The usual value of MCYCL is 1.
The theoretical value of both ALPHL and ALPHM to achieve truncation error is
0.125. If, however, one is solving in the mode where the discrete $L^2$ norm
of the residual is to be reduced to less than TOL, then ALPHM = 0 should be
used. The flexibility provided by these four parameters is awesome.

Aside from specifying these parameters, the user must provide the
subroutine PUTF, which specifies the difference equations on the finest
grid. (As remarked above, certain values of ICOEF would require PUTF to
make sense for coarser grids as well.) An example of a PUTF is given in
the listing of BOXMG in reference 7. PUTF has one argument K and a call
to KEY in it, CALL KEY(K,JST,II,JJ,HX,HY), which fetches the storage for
the arrays. For IFD = 1, the user must specify the arrays FR, FA, SO,
SOR, and QF. For IFD ≠ 1, he must specify FSW and FNW as well. The
logical grid is assumed to be (I,J); I = 1, ..., II; J = 1, ..., JJ. The
sets {(1,J): J = 1, ..., JJ}, {(II,J): J = 1, ..., JJ}, {(I,1): I = 1,
..., II}, and {(I,JJ): I = 1, ..., II} are fictitious points, assumed for
ease of programming. For IFD = 1, the template

                       - FA(JP+I)

    - FR(JO+I)           SO(JO+I)        - FR(JO+I+1)

                       - FA(JO+I)

is assumed, where JO = JST(J) and JP = JST(J+1). For IFD ≠ 1, the template

    - FNW(JP+I)        - FA(JP+I)        - FSW(JP+I+1)

    - FR(JO+I)           SO(JO+I)        - FR(JO+I+1)

    - FSW(JO+I)        - FA(JO+I)        - FNW(JO+I+1)

is assumed. In both cases, QF(JO+I) should be the right hand side, I = 2,
..., I1 = II-1; J = 2, ..., J1 = JJ-1 in PUTF. The boundary conditions
should be incorporated into the operator, so that all coefficients
referring to fictitious points should be zero. For example, FR(JO+2),
FSW(JO+2), and FNW(JP+2) should all be zero, J = 1, ..., JJ.
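
The five point template can be read as follows. This is a hypothetical rendering with zero-based two-dimensional Python lists indexed `[j][i]` instead of BOXMG's one-dimensional arrays with offsets JST(J); the function name `apply_five_point` is an assumption, and only the coefficient layout is taken from the template above.

```python
def apply_five_point(SO, FR, FA, u):
    """Apply the five point template at the non-fictitious points of a logical grid.

    FR[j][i] couples point (i, j) to its west neighbor and FA[j][i] couples it
    to its south neighbor, so the east and north couplings are FR[j][i+1] and
    FA[j+1][i]. The outermost ring of the arrays holds the fictitious points.
    """
    jj, ii = len(u), len(u[0])
    out = [[0.0] * ii for _ in range(jj)]
    for j in range(1, jj - 1):
        for i in range(1, ii - 1):
            out[j][i] = (SO[j][i] * u[j][i]
                         - FR[j][i] * u[j][i - 1] - FR[j][i + 1] * u[j][i + 1]
                         - FA[j][i] * u[j - 1][i] - FA[j + 1][i] * u[j + 1][i])
    return out
```

Because every coefficient that refers to a fictitious point is zero, whatever values are stored at the fictitious points of `u` have no effect on the result.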

BOXMG automatically determines the number of grids M from the input
parameters NXM and NYM. It does this by bisecting the given logical grid
until it arrives at a grid which cannot be practically bisected any
further, i.e., when the number of x or y unknowns is three or four. (For
NXM >> NYM or NXM << NYM, this may lead to the situation of its being
profitable to bisect the coarsest grid only in the x or y direction, but
this feature is not provided in BOXMG.) Once the number of levels is
determined, BOXMG computes how much storage must be allowed for the
various arrays. If insufficient storage has been declared, a message is
printed and the code terminates.

The storage parameters in BOXMG are:

NOG   = maximum number of grids
      = maximum storage for NX and NY in common block DCl
      = maximum storage for NST, IMX, JMX, HX, HY and IND in common block GRD
NFMAX = maximum storage for arrays Q, QF, FR, FA, SO, SOR, TOT
NCMAX = maximum storage for arrays CIA, CIR, CISW, CISE, CINW, CINE, CIL, CIB
NABD1 = maximum first subscript of ABD, where ABD is the array used for
        direct solution on the coarsest grid
NABD2 = maximum second subscript of ABD

If IFD = 1, the storage required for FSW and FNW is NCMAX; otherwise NFMAX
is required. If IRELAX = 1 or 2, SOS can be dimensioned to 1; otherwise,
SOS should be dimensioned to NFMAX.




IV. EXAMPLES 

The first example is for eq. (5) for Q = (0,2A) x (0,24). The 
boundary conditions are 

9u _ ( -u/2D on y = 24 or X = 24 
9v ( 0, otherwise, 

and D is given by 

^ X [0,12) U (12,20] X (12,20] 

(1000, otherwise 

We take $\sigma = 1/3D$, and f = 0 when D = 1 and f = 1 when D = 1000. The 
results are summarized in Table 4.1. In this table 1.64, -1, for example, 
is used for $1.64 \times 10^{-1}$. Also $\tau^{M-1}$ is the quantity in (11); the 
number r is the exponent in the asymptotic expansion of the error in the 
$\tau$'s; it is computed by a formula of the form log(...)/log 2. The first 
three rows show the results of running one cycle starting on the coarsest 
grid; the next four rows continue from there until the discrete $L^2$ norm 
of the error is less than $10^{-6}$. In the last row, HXM and HYM are still 
.5, so that the region is really (0.,23.) x (0.,23.); this example 
illustrates the picture of Fig. 3 for the transition from $G^k$ to 
$G^{k-1}$, k = 5, 4, 3, 2. Since (0.,23.) x (0.,23.) is a small perturbation 
of (0.,24.) x (0.,24.), one would expect comparable results for the two 
cases. An example of parameters is for the last row, where NXM = NYM = 48, 
HXM = HYM = .5, TOL = whatever, IFD = 1, IU = 1, ID = 1, ISTRT = 20, 
IRELAX = 1, ITAU = 0, ICOEF = 0, IVW = 1, MCYCL = 1, ALPHL = 0.125, 
ALPHM = 0.125. 
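
The argument of the logarithm in the formula for r is not legible in the source. A minimal sketch, assuming r is the base-2 logarithm of the ratio of the tabulated values on two successive grids, reproduces entries of Table 4.1; the function name `estimate_order` is hypothetical.

```python
import math


def estimate_order(coarse_value, fine_value):
    """Assumed estimate: r = log(coarse/fine) / log 2 for a mesh ratio of 2."""
    return math.log(coarse_value / fine_value) / math.log(2.0)


if __name__ == "__main__":
    # Table 4.1 entries "6.00, -1" (25 x 25) and "1.84, -1" (49 x 49).
    print(round(estimate_order(6.00e-1, 1.84e-1), 2))
```

With the 25 x 25 and 49 x 49 entries this gives approximately 1.71, the value listed in the table, which supports the assumed form.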

The second example is the same as the first example except that 
cell-centered differencing is employed. For the runs made, the interface 
comes midway between cell centers. In one dimension, if an interface is 
located at ih and $D(x) = D_+$ if x > ih and $D(x) = D_-$ if x < ih, then the 
difference equation at (i-1/2)h is 


"®-“-i-3/2 1/2(D^ + D_)^ “i-1/2 " 1/2(D^ + D_) “i+1/2 
Table 4.1 

Number of Unknowns              CPU Time in   max|tau(i,j)| and      Reduction in Discrete L2 Norm 
                                Seconds on    Estimate of r          of Residual in last cycle 
                                CDC 7600                             and number of cycles 

13 x 13, ALPHM = .125           .017          2.06                   .09; NCYC = 3 
25 x 25, ALPHM = .125           .052          6.00, -1; r = 1.61     .09; NCYC = 3 
49 x 49, ALPHM = .125           .170          1.84, -1; r = 1.71     .07; NCYC = 3 
13 x 13, TOL = 10^-6            .036          2.06                   .13; NCYC = 8 
25 x 25, TOL = 10^-6            .126          6.00, -1; r = 1.61     .34; NCYC = 10 
49 x 49, TOL = 10^-6            .430          1.84, -1; r = 1.71     .46; NCYC = 14 
49 x 49, TOL = 10^-6, IVW = 2   .554          ...                    .23; NCYC = 9 
48 x 48, ALPHM = .125           .156          2.74, -1               .07; NCYC = 3 



a similar formula holds in two dimensions. The results are summarized in 
Table 4.2. An example of parameters for this problem is for the last row, 
where NXM = NYM = 48, HXM = HYM = .5, TOL = 10^-6, IFD = 1, IU = 1, ID = 1, 
IM = 2, ISTRT = 50, IRELAX = 1, ITAU = 0, ICOEF = 0, IVW = 1, MCYCL = 1, 
ALPHL = 0.125, ALPHM = 0. Max|tau(i,j)| is assumed next to (24., 12.), near 
the interface and right boundary. Away from the interfaces the tau's are 
well behaved. Let us examine the answers at (24., 12.) and compare them 
with those obtained from the vertex centered scheme. By using the 
approximation to the boundary condition for a horizontal averaging and 
conservation of flux for vertical averaging, we can get approximations to 
the solution at (24., 12.) for the cell-centered scheme; call them 
Table 4.2 

Number of Unknowns        CPU Time in   max|tau(i,j)| and    Reduction in Discrete L2 Norm 
                          Seconds on    Estimate of r        of Residual in last cycle 
                          CDC 7600                           and number of cycles 

12 x 12, ALPHM = .125     .014          4.82                 .13; NCYC = 3 
24 x 24, ALPHM = .125     .045          3.73; r = .37        .10; NCYC = 3 
48 x 48, ALPHM = .125     .116          2.30; r = .70        .08; NCYC = 2 
12 x 12, TOL = 10^-6      .038          4.87                 .14; NCYC = 10 
24 x 24, TOL = 10^-6      .103          3.75; r = .38        .12; NCYC = 9 
48 x 48, TOL = 10^-6      .354          2.26; r = .73        .10; NCYC = 9 


$\bar u^3_{cc}$, $\bar u^4_{cc}$, $\bar u^5_{cc}$ (for finest grid 12 x 12, 24 x 24, and 48 x 48 
respectively and tolerance $10^{-6}$). Let $u^3_{vc}$, $u^4_{vc}$, $u^5_{vc}$ be the answers 
from the vertex centered scheme at (24., 12.) (for finest grid 13 x 13, 
25 x 25, and 49 x 49 respectively and tolerance $10^{-6}$). Compute 
$\bar u^{RE}_{cc} = \tfrac{4}{3}\bar u^5_{cc} - \tfrac{1}{3}\bar u^4_{cc}$ and 
$u^{RE}_{vc} = \tfrac{4}{3}u^5_{vc} - \tfrac{1}{3}u^4_{vc}$. (RE stands for Richardson 
extrapolation.) Then 

$$|\bar u^3_{cc} - \bar u^{RE}_{cc}| = .801, \quad |\bar u^4_{cc} - \bar u^{RE}_{cc}| = .226, \quad |\bar u^5_{cc} - \bar u^{RE}_{cc}| = .056;$$

and 

$$|u^3_{vc} - u^{RE}_{vc}| = .766, \quad |u^4_{vc} - u^{RE}_{vc}| = .224, \quad |u^5_{vc} - u^{RE}_{vc}| = .056.$$

Thus the assumption of asymptotic error of $Ch^2$ at (24., 12.) for both 
schemes is justified, and, at least for this example, there is no reason 
from considerations of accuracy to prefer the vertex centered scheme to the 
cell centered scheme. (We have also checked points away from the interfaces 
and boundaries, and the same conclusion, less interesting in these cases, 
is valid.) 
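
The extrapolation used above can be sketched as follows, assuming an error of the form C h^p and a mesh ratio of 2 between the two grids. The function name `richardson` and the synthetic data in the demonstration are illustrative only and are not results from the paper.

```python
def richardson(u_fine, u_coarse, order=2):
    """Combine values on grids h and 2h, assuming an error of the form C*h**order."""
    factor = 2 ** order
    # For order = 2 this is 4/3 * u_fine - 1/3 * u_coarse.
    return (factor * u_fine - u_coarse) / (factor - 1)


if __name__ == "__main__":
    # Synthetic check: u(h) = 1 + 3*h**2 has the exact limit 1.
    def u(h):
        return 1.0 + 3.0 * h * h

    print(richardson(u(0.5), u(1.0)))
```

For data whose error is exactly C h^2 the extrapolated value equals the limit, which is why the differences from the extrapolated value should drop by about a factor of 4 per refinement, as they do in the figures quoted above.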





The third example is 


$$-\Delta U = F \ \text{on}\ \Omega = (0,1) \times (0,1), \qquad U = 0 \ \text{on}\ \partial\Omega,$$

where F is chosen so that the solution is $U(x,y) = 3e^{x}e^{y}xy(1-x)(1-y)$. 
The only way one can handle such a Dirichlet problem with BOXMG is to 
incorporate the boundary data into the right hand side of the finest grid. 
Thus the difference operator along the x = 0 boundary away from the cor- 
ners is 



I.e., the boundary is not treated as part of the grid at all. To have the 
boundary treated as part of the grid in this case would have required a 
lot of special cases in BOXMG; hence, we decided not to implement this 
option. 

The results are summarized in Table 4.3. In this table, 



In this example, two cycles appear to be sufficient to solve nearly to 
truncation error in both the function values and their derivatives even 
though cubic interpolation is not employed. An example of parameters for 
this problem is for the fourth row, where NXM = NYM = 9, HXM = HYM = .1, 
TOL = 10^-6, IFD = 1, IU = 1, ID = 1, ISTRT = 50, ITAU = 0, ICOEF = 0, 
IVW = 1, MCYCL = 1, ALPHL = 0.125, ALPHM = 0. 

The fourth example is 

$$-\Delta U = F \ \text{on}\ \Omega = \text{circle of diameter 1. centered at (0., 0.)}, \qquad U(x,y) = g(x,y) = 3e^{x}e^{y}xy(1-x)(1-y) \ \text{if}\ (x,y) \in \partial\Omega, \qquad (12)$$

where F is chosen so that the solution is $U(x,y) = 3e^{x}e^{y}xy(1-x)(1-y)$. 
This example illustrates the technique of embedding. We embed $\Omega$ in 
$\Omega' = (-.5, .5) \times (-.5, .5)$. At points in $\Omega' \setminus \Omega$ we write down the equation 
Table 4.3 

Number of Unknowns      CPU Time in   max|u(i,j) - U(i,j)|   max|Dx u(i,j) - dU/dx(i,j)|   max|tau(i,j)|          Reduction in Discrete L2 Norm 
                        Seconds on    and Experimental p     and Experimental q            and Experimental r     of Residual in Last Cycle 
                        CDC 7600                                                                                  and Number of Cycles 

9 x 9, ALPHM = .125     .011          1.01, -3               1.13, -2                      6.26, -3               .05; NCYC = 3 
19 x 19, ALPHM = .125   .032          2.53, -4; p = 2.00     3.15, -3; q = 1.84            8.17, -4; r = 2.93     .03; NCYC = 3 
39 x 39, ALPHM = .125   .109          6.37, -5; p = 1.99     8.37, -4; q = 1.91            7.28, -5; r = 3.48     .02; NCYC = 3 
9 x 9, TOL = 10^-6      .018          1.02, -3               1.12, -2                      6.27, -3               .09; NCYC = 5 
19 x 19, TOL = 10^-6    .052          2.56, -4; p = 1.99     3.15, -3; q = 1.83            8.16, -4; r = 2.99     .11; NCYC = 6 
39 x 39, TOL = 10^-6    .174          6.42, -5; p = 2.00     8.37, -4; q = 1.91            7.24, -5; r = 3.49     .11; NCYC = 6 
39 x 39, ISTRT = 1      .082          7.48, -5; p = 1.76     9.82, -4; q = 1.76            6.99, -5; r = 3.54     .01; NCYC = 2 




u = 0.; at points in $\Omega$ whose north, south, east, and west neighbors are in 
$\Omega$, we use the usual five point Laplacian. For simplicity we use the 
simplest treatment of points that don't fall into either of the above 
sets. Consider, for example, a point (ih,jh) in $\Omega$ whose neighbor 
((i+1)h,jh) is not in $\Omega$, and let the distance from (ih,jh) to $\partial\Omega$ be 
$\theta h$. Approximating $-h^2 U_{xx}$ by $(-U_{i-1,j} + 2U_{i,j} - U_{i+1,j})$ and using 
the relation $g((i+\theta)h,jh) = (1-\theta)U_{i,j} + \theta U_{i+1,j}$ to solve for 
$U_{i+1,j}$ gives the following difference equation at (ih,jh): 

$$-u_{i-1,j} + \left(3 + \frac{1}{\theta}\right)u_{i,j} - u_{i,j-1} - u_{i,j+1} = h^2 F(ih,jh) + \frac{1}{\theta}\,g((i+\theta)h,jh); \qquad (13)$$

note that there is no coupling between (ih,jh) and ((i+1)h,jh). 

The results are summarized in Table 4.4. 
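
The boundary-adjacent equation (13) can be checked with a small sketch. The coefficients below follow from eliminating the outside neighbor through the linear relation for g given in the text; the function name `cut_stencil` and the dictionary layout are hypothetical and are not part of BOXMG.

```python
def cut_stencil(theta, h, f_value, g_value):
    """Coefficients and right hand side at a point whose east neighbor lies outside.

    theta * h is the distance from the point to the boundary along the x direction.
    """
    inv = 1.0 / theta
    # No "east" entry: the outside neighbor has been eliminated.
    coeffs = {"west": -1.0, "south": -1.0, "north": -1.0, "center": 3.0 + inv}
    rhs = h * h * f_value + inv * g_value
    return coeffs, rhs


if __name__ == "__main__":
    # Check with the harmonic function U(x, y) = x, for which F = 0.
    x0, h, theta = 0.3, 0.1, 0.4
    coeffs, rhs = cut_stencil(theta, h, 0.0, x0 + theta * h)
    lhs = (coeffs["west"] * (x0 - h) + coeffs["center"] * x0
           + coeffs["south"] * x0 + coeffs["north"] * x0)
    print(abs(lhs - rhs) < 1e-12)
```

The check uses a linear function because both the five point formula and the linear relation for g are exact for it, so the two sides must agree to rounding error.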

The fifth example is the same as the fourth example except that in 
this case we use mapping to solve it. That is, we map the boundary of $\Omega$ 
onto the boundary of $\Omega'' = (0,1) \times (0,1)$, giving x and y as functions of 
$\xi$ and $\eta$ on $\partial\Omega''$, and solve approximately the problem 

$$\Delta x = 0, \ (\xi,\eta) \in \Omega''; \qquad x = x(\xi,\eta), \ (\xi,\eta) \in \partial\Omega'';$$
$$\Delta y = 0, \ (\xi,\eta) \in \Omega''; \qquad y = y(\xi,\eta), \ (\xi,\eta) \in \partial\Omega''. \qquad (14)$$

We do this by discretizing $\Omega''$, approximating eq. (14) by five point 
Laplacians and specifying 

x(ih,1) = (1/2) cos(... - ih...),    y(ih,1) = (1/2) sin(... - ih...), 
x(ih,0) = (1/2) cos(... + ih...),    y(ih,0) = (1/2) sin(... + ih...), 
x(0,ih) = (1/2) cos(... - ih...),    y(0,ih) = (1/2) sin(... - ih...), 
Table 4.4 

Number of Unknowns      CPU Time in   max|u(i,j) - U(i,j)|   max|Dx u(i,j) - dU/dx(i,j)|   max|tau(i,j)|          Discrete L2 Norm      Reduction in Discrete L2 Norm 
                        Seconds on    and Experimental p     and Experimental q            and Experimental r     of Error and          of Residual in Last Cycle 
                        CDC 7600                                                                                  Estimate of p         and number of cycles 

9 x 9, ALPHM = .125     .011          3.21, -3               2.53, -2                      3.15, -2               5.37, -4              .04; NCYC = 3 
19 x 19, ALPHM = .125   .032          9.31, -4; p = 1.78     8.11, -3; q = 1.64            1.71, -1; r = 2.44     1.72, -4; p = 1.64    .07; NCYC = 3 
39 x 39, ALPHM = .125   .109          7.24, -4; p = .36      3.77, -3; q = 1.11            1.62, -1; r = .08      1.91, -4; p = -.15    .10; NCYC = 3 
9 x 9, TOL = 10^-4      .018          3.21, -3               2.53, -2                      3.15, -2               5.37, -4              .04; NCYC = 6 
19 x 19, TOL = 10^-4    .073          9.14, -4; p = 1.81     7.94, -3; q = 1.67            1.71, -1; r = -2.44    1.66, -4; p = 1.69    .10; NCYC = 9 
39 x 39, TOL = 10^-4    .269          2.50, -4; p = 1.87     2.77, -3; q = 1.51            1.61, -1; r = .09      4.82, -5; p = 1.78    .13; NCYC = 10 
39 x 39, ALPHM = .05    .134          2.77, -4; p = 1.74     2.80, -3; q = 1.53            1.61, -1; r = .09      6.04, -5; p = 1.51    .11; NCYC = 4 




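x(1,ih) = (1/2) cos(... + ih...),    y(1,ih) = (1/2) sin(... + ih...). 

Ideally, eq. (14) should be solved by multigrid, but for simplicity in 
this example we used SOR. The equation (13) transforms to the following 
in the $(\xi,\eta)$ coordinate system: 

$$-(G_1 y_\eta - G_2 x_\eta)_\xi - (G_2 x_\xi - G_1 y_\xi)_\eta = JF, \quad (\xi,\eta) \in \Omega''; \qquad u(\xi,\eta) = g(\xi,\eta), \quad (\xi,\eta) \in \partial\Omega'', \qquad (15)$$

where $G_1 = \frac{1}{J}(u_\xi y_\eta - u_\eta y_\xi)$, $G_2 = \frac{1}{J}(u_\eta x_\xi - u_\xi x_\eta)$, and 
$J = x_\xi y_\eta - x_\eta y_\xi$. 
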

Equation (15) is differenced in cell-centered form. The results are 
summarized in Table 4.5. 

Since J is singular at the corners of $\Omega''$, it is not surprising that 
the error in the approximation to the x-derivative (the finite difference 
version of $\frac{1}{J}(y_\eta u_\xi - y_\xi u_\eta)$) grows larger as the mesh is refined. 
For the fixed point (.1,.1), the interior point nearest (0,0) on the 
10 x 10 grid, this error decreases; nevertheless the maximum error in the 
approximation to the x-derivative grows and is always assumed at a point 
nearest one of the corners of $\Omega''$. 

The sixth example uses the mesh in Fig. 4, which was the mesh used 
in a Rayleigh-Taylor calculation in Ref. 3. We include it since it is 
rather distorted (in fact, as commented in Ref. 3, a "bowtie" forms on the 
next time step) and represents a challenge to the black box approach. We 
use the same differencing as employed for eq. (15). For this example, it 
is not clear what continuous system is being approximated. If, however, we 
use Dirichlet data identically equal to 1. and F = 0., then the solution 
to the difference equations is identically 1. The results are summarized 
in Table 4.6. 
Table 4.5 

Table 4.6 

Number of Unknowns                   CPU Time in   Max|u(i,j) - U(i,j)|   Maximum Error     L2 Norm     L2 Norm of Error   Max|tau(i,j)|   Reduction in Discrete L2 Norm 
                                     Seconds on                           in x-derivative   of Error    in x-derivative                    of Residual in Last Cycle 
                                     CDC 7600                                                                                              and Number of Cycles 

12 x 12, ALPHM = .125                .016          1.46, -1               1.29, -2          1.95, 0     1.50, -1           2.98, -1        .42; NCYC = 3 
12 x 12, TOL = 10^-6                 .082          8.55, -7               1.45, -5          1.05, -5    1.48, -6           3.56, 1         .52; NCYC = 21 
12 x 12, ALPHM = .125, IRELAX = 4    .017          6.34, -2               6.34, -3          1.37, 0     1.02, -1           3.32, 1         .02; NCYC = 3 
12 x 12, IRELAX = 4, TOL = 10^-6     .080          1.89, -7               1.49, -8          3.17, -6    2.22, -7           3.57, 1         ...; NCYC = 11 



V. CAVEATS AND EXTENSIONS 

The examples of Section IV all exhibit good behavior. We begin this 
section with three examples which do not. The first is 


$$-\Delta u - \epsilon u = F \ \text{on}\ \Omega = (0,1) \times (0,1), \qquad \frac{\partial u}{\partial \nu} = 0 \ \text{on}\ \partial\Omega.$$

One can solve this problem with BOXMG until $\epsilon$ becomes too large; $\epsilon$ is 
too large when relaxation sweeps on grid $G^2$ magnify, instead of reduce, 
the error. The remedy would be to change BOXMG to allow the coarsest grid 
to be finer; with this remedy BOXMG could be extended to handle some 
non-definite symmetric problems. See the discussion in Section IV of 
reference 1. 

Another example of poor behavior is for a difference operator with a 
template like 


$$\begin{bmatrix} -\epsilon & -\epsilon & -1 \\ -\epsilon & 2+6\epsilon & -\epsilon \\ -1 & -\epsilon & -\epsilon \end{bmatrix}, \qquad (16)$$

where $\epsilon \ll 1$. None of the relaxation options in BOXMG provide good 
smoothing on the finest grid for such an operator. The remedy is to write 
a block relaxation routine which relaxes the strongly coupled one 
dimensional sets as blocks; in this case they are the southwest to 
northeast diagonals. Such a template as eq. (16) can arise in physically 
meaningful problems; see reference 8, for example. (In reference 8, 
however, situations like eq. (16) would arise so infrequently as to be not 
worth the effort of special treatment.) 

A final example of poor behavior is when the difference operator on 
the finest grid is close to the skewed Laplacian (or any operator with 
strong connections like the skewed Laplacian): 

$$\begin{bmatrix} -1 & -\epsilon & -1 \\ -\epsilon & 4(1+\epsilon) & -\epsilon \\ -1 & -\epsilon & -1 \end{bmatrix}$$

where $\epsilon \ll 1$. This situation was discussed in reference 3. Here we know 
of no remedy that would fit into the general framework of BOXMG. 

Several extensions of BOXMG are possible and are under investigation. 
Two, fairly straightforward, are to symmetric systems and three dimen- 
sional problems. The third, more difficult, is to nonsymmetric equations. 
The fourth is to handle equations on arbitrary regions without resorting 
to embedding. The fifth is local mesh refinement - both fixed and 
adaptive. For all except the first two extensions, it is not clear at 
this time how much of the black box philosophy can be retained, and in the 
third and fourth extensions, it is not clear if there is a uniform 
strategy for both serial and vector machines. 

Finally, we thank Achi Brandt for advice that improved this paper. 

VII. CODE LISTING 

[In, LA-UR, list.] 

A listing of BOXMG is contained in Ref. 7, which may be obtained by 
requesting it from: 

J. E. Dendy, Jr. 

T-7, MS-610 

Los Alamos National Laboratory 
Los Alamos, NM 87545 
HIGH ORDER MULTI-GRID METHODS TO SOLVE THE POISSON EQUATION 


Steve Schaffer 
Colorado State University 


I. Introduction 

This paper treats several high order multi-grid methods based on finite 
difference discretization of the model problem: 

(0)   $LU = F$ on the interior of $\Omega$, 
      $U = G$ on $\partial\Omega$. 

Here, L is the Laplace operator, $\Omega$ the unit square, and the functions F and G 
are at least piecewise continuous. 

In section II, a fixed high order FMG-FAS multi-grid algorithm (which 
underlies each of the high order methods) is briefly discussed. In section 
III, the high order methods are described. In section IV, results are pre- 
sented on four problems using each method with the same underlying fixed FMG- 
FAS algorithm. It is noted that optimal efficiency of any one of these meth- 
ods is not attained with this fixed algorithm, the purpose of the experiments 
being, rather, to give a comparative point of view for the different methods. 


II. The Fixed High Order FMG-FAS Algorithm 

A sequence of uniform grids, $\Omega^h$, on the unit square is given, with 
decreasing mesh sizes h = 1/2, 1/4, ..., 1/64. For each h, the discretization 
of (0) is denoted by: 

(1)   $L_s^h U^h = F^h$ on the interior of $\Omega^h$, 
      $U^h = G^h$ on the boundary of $\Omega^h$, 

where $L_s^h$ is a finite difference operator indexed by s that approximates L, 
$G^h$ is the injection of G onto $\partial\Omega^h$, and $F^h$ is the injection of F onto 
$\Omega^h$ (or some weighted average of F to be described later). 

The full approximation scheme (FAS) multi-grid cycle is used to solve 
equations of the form (1) (see ref. 1). We describe it in the following 
together with several multi-grid features that are used in all of our 
experiments. Subsequent reference to an FAS cycle will always imply $L_s^h$ is 
second order. The grid function, $u^h$, will represent the current 
approximation held on $\Omega^h$. We first define the two-level FAS cycle by the 
following four steps. 
1. Given some initial approximation, make 2 Gauss-Seidel 
   relaxation sweeps on (1) using checkerboard ordering 
   of the grid points (ref. 2). 

2. Solve the FAS coarse grid equation on $\Omega^{2h}$ given by: 

   (2)   $L_s^{2h} U^{2h} = L_s^{2h} I_h^{2h} u^h + \mathbb{I}_h^{2h}(F^h - L_s^h u^h)$ on the interior of $\Omega^{2h}$, 
         $U^{2h} = G^{2h}$ on $\partial\Omega^{2h}$. 

   Here, the grid transfer operator $I_h^{2h}$ represents injection 
   and $\mathbb{I}_h^{2h}$ represents "full weighting" defined by 
   the stencil: 

   $$\frac{1}{16}\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix}_h.$$

3. Correct the current approximation on grid h via: 

   (3)   $u^h \leftarrow u^h + I_{2h}^h (U^{2h} - I_h^{2h} u^h)$, 

   where the symbol $\leftarrow$ represents replacement and $I_{2h}^h$ 
   represents interpolation (linear for second order 
   methods and cubic for fourth order methods). 

4. Repeat step 1 with 1 relaxation sweep using the current 
   $u^h$ for the initial approximation. 

A multi-level FAS cycle (or, simply, FAS cycle) is then defined 
recursively by using this two-level FAS cycle to solve (2) on $\Omega^{2h}$, and 
similarly on $\Omega^{4h}$, continuing in this way until equation (2) is formed 
on grid h = 1/2, where the equation is then solved exactly. We demand two 
such cycles on each coarser grid, except the coarsest, before correcting the 
approximation on the next finer grid (step 3). This is called a "W" cycle 
and can be represented by the following diagram (for h = 1/2, 1/4, 1/8, 
1/16). 

[Diagram: W cycle over the levels h = 1/16, h = 1/8, h = 1/4, h = 1/2] 

where o, | and / represent, respectively, relaxation (step 1 or 4), fine to 
coarse grid transfer (step 2) and coarse to fine correction (step 3). 
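
The order in which a W cycle visits the grids can be sketched recursively. Level 0 stands for the coarsest grid (h = 1/2) and each entry of the returned list marks a relaxation or, on level 0, the exact solve; the function name `w_cycle` and this encoding are assumptions made for illustration.

```python
def w_cycle(level, visits=None):
    """Record the levels visited by one W cycle started on the given level."""
    if visits is None:
        visits = []
    if level == 0:
        visits.append(0)  # exact solve on the coarsest grid
        return visits
    visits.append(level)  # relaxation before the coarse grid correction (step 1)
    # Two cycles on each coarser grid, except the coarsest, which is solved once.
    repeats = 1 if level - 1 == 0 else 2
    for _ in range(repeats):
        w_cycle(level - 1, visits)
    visits.append(level)  # relaxation after the correction (step 4)
    return visits


if __name__ == "__main__":
    # Four levels: h = 1/16 (level 3) down to h = 1/2 (level 0).
    print(w_cycle(3))
```

Reading the printed list from left to right traces the W shape of the diagram for the four grids h = 1/16, ..., 1/2.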

Our high order full multi-grid (HFMG) algorithm begins on grid h = 1/4. 
Using a zero initial guess, one FAS cycle (second order) is made on (1). The 
resulting approximate solution is then cubically interpolated to grid h = 1/8 
and is used as an initial approximation for one FAS cycle on equation (1) 
formulated on this grid. At this point, the final approximation is used as the 
initial approximation for one of the fourth order methods described in the 
next section. This fourth order method is then used successively on the next 
three finer grids using the cubic interpolant of the final approximation on 
the previous grid as the initial guess. The second order FMG algorithm (ref. 
3) proceeds as above without the switch to higher order. 


III. High Order Methods 

The following finite difference operators are considered. 

$L_1^h$ - The usual second order five point star operator. 

$L_2^h$ - The operators $\partial^2/\partial x^2$ and $\partial^2/\partial y^2$ are each 
      approximated in their respective directions by fourth 
      order finite differences. For example, at a 
      point $(x,y) \in \Omega^h$, $\partial^2/\partial x^2$ is approximated by the 
      stencil 

      (4)   $\frac{1}{12h^2}\,(-1 \quad 16 \quad [-30] \quad 16 \quad -1)$, 

      when h < x < 1 - h, and by 

      (5)   $\frac{1}{12h^2}\,(10 \quad [-15] \quad -4 \quad 14 \quad -6 \quad 1)$ 

      when x = h. Brackets mark the central coefficient. 

$L_3^h$ - The operator used in Mehrstellen Verfahren. A 
      second order finite difference operator is used 
      with a weighted average of F to produce the fourth 
      order equation: 

      (6)   $\frac{1}{6h^2}\begin{pmatrix} 1 & 4 & 1 \\ 4 & -20 & 4 \\ 1 & 4 & 1 \end{pmatrix} U^h(x,y) = \frac{1}{12}\begin{pmatrix} & 1 & \\ 1 & 8 & 1 \\ & 1 & \end{pmatrix} F(x,y)$ 

$L_4^h$ - This operator agrees with $L_2^h$ at all points except 
      those whose distance to the boundary is h. 
      At these points, where noncentral differencing 
      occurred in $L_2^h$, the Mehrstellen Verfahren (6) is 
      used. 
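
The one dimensional stencils (4) and (5) can be checked on a polynomial. The source is only partly legible for (5); the coefficient set (10, -15, -4, 14, -6, 1) used below is the standard fourth order one-sided formula and is an assumption where the scan is unreadable. The function names are hypothetical.

```python
def d2_central(f, x, h):
    """Fourth order central approximation to f''(x), stencil (4)."""
    return (-f(x - 2 * h) + 16 * f(x - h) - 30 * f(x)
            + 16 * f(x + h) - f(x + 2 * h)) / (12 * h * h)


def d2_near_boundary(f, h):
    """Fourth order approximation to f''(h) from values at 0, h, ..., 5h.

    Coefficients are the standard one-sided set (assumed for stencil (5)).
    """
    coeffs = (10, -15, -4, 14, -6, 1)
    return sum(c * f(k * h) for k, c in enumerate(coeffs)) / (12 * h * h)


if __name__ == "__main__":
    def quartic(x):
        return x ** 4  # second derivative is 12 x^2

    h = 0.1
    print(d2_central(quartic, 0.5, h), 12 * 0.5 ** 2)
    print(d2_near_boundary(quartic, h), 12 * h ** 2)
```

Both formulas are exact, up to rounding, for polynomials of degree five or less, so the quartic test should return 12 x^2 at the evaluation point.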
The fourth order methods are described on a given grid h where the 
initial approximation is obtained by the HFMG algorithm. Relaxation will 
always refer to Gauss-Seidel with checkerboard ordering of the points. 

MG2 - The second order FMG-FAS algorithm on equation (1), 
      which is used here for comparison. 

MGW2 - The second order FMG-FAS algorithm using the full 
      weighting of F for $F^h$ in equation (1). 

$H_s$ - High order relaxations. Make three relaxation 
      sweeps on: 

      (7)   $L_s^h U^h = F^h$ 

      where s = 2, 3 or 4. Next, solve the coarse grid 
      equation: 

      (8)   $L_1^{2h} U^{2h} = L_1^{2h} I_h^{2h} u^h + \mathbb{I}_h^{2h}(F^h - L_s^h u^h)$ 

      using an FAS cycle (second order). Then correct 
      the current approximation, $u^h$, using equation (3) 
      and make one relaxation sweep on (7). 

$D_s$ - Outerloop defect corrections (ref. 4). Make one 
      relaxation sweep on the second order equation (1). 
      Form the equation: 

      (9)   $L_1^h U^h = F^h + L_1^h u^h - L_s^h u^h$,   s = 2, 3 or 4 

      on grid h and make an FAS cycle on (9) using only 
      one relaxation sweep on grid h at step 1. 

$\tau$ - The method of $\tau$-extrapolation is based on the 
      assumption that the second order truncation error 
      function, $\tau^h$, has the local expansion: 

      (10)   $\tau^h = L_1^h U - F^h = Ah^2 + O(h^4)$, 

      where the function A is independent of h. 
      Extrapolating, we obtain: 

      (11)   $\tau^{2h} = \tfrac{4}{3}(\tau^{2h} - \tau^h) + O(h^4)$. 

      If the grid function, $u^h$, is reasonably close 
      to the second order solution of (1), then it 
      is not difficult to show that if we define: 

      (12)   $\tau_h^{2h} = \tfrac{4}{3}\left[(L_1^{2h} I_h^{2h} u^h - F^{2h}) - \mathbb{I}_h^{2h}(L_1^h u^h - F^h)\right]$, 

      then 

      (13)   $\tau^{2h} - \tau_h^{2h} = O(h^4)$. 

      Thus, it follows from adding $\tau_h^{2h}$ to the grid 2h 
      version of (1) that a fourth order equation is 
      produced there. The method, then, proceeds as 
      follows. Make one relaxation sweep on the second 
      order equation (1). Form the extrapolated grid 
      2h equation: 

      (14)   $L_1^{2h} U^{2h} = F^{2h} + \tau_h^{2h}$ 

      and solve by an FAS cycle. Then correct the current 
      approximation, $u^h$, using equation (3). At 
      this point, a relaxation sweep on the second order 
      equation (1) is performed only to smooth out the 
      error for further extrapolations on the next finer 
      grid. Otherwise, the corrected $u^h$ will be the 
      final approximation for the $\tau$-extrapolation method. 

W$\tau$ - The method of weighted $\tau$-extrapolation is exactly 
      the same as $\tau$-extrapolation except that $F^h$ is 
      defined by the full weighting of F given by: 

      (15)   $F^h(x,y) = \frac{1}{16}\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix}_{h/2} F(x,y) = F(x,y) + B(x,y)h^2 + O(h^4)$ 

      at points $(x,y) \in \Omega^h$. The function B(x,y) is 
      independent of h. The same arguments used in 
      $\tau$-extrapolation carry over here owing to the 
      similarity of the expansions in (10) and (15). 


IV. Numerical Results 

In our experiments, the solution to the continuous problem was preselected 
and the functions F and G were defined accordingly. 
PROBLEM 1 (smooth solution) 

      $U(x,y) = \sin(\pi x)\,\sin(2\pi y)$ 

PROBLEM 2 (oscillatory solution) On grid h = 1/16 the solution 
      contains an average of 4.2 grid points per wavelength. 

      $U(x,y) = \sin(\pi(7x + 9y))$ 

PROBLEM 3,4 (jump discontinuities in the $n^{th}$ derivative of the 
      solution) Using the functions 

      $T(x,y) = y - x^2 + x - .75$ 

      and 

      $C(x,y) = \begin{cases} \ \ 1 & T(x,y) > 0 \\ -1 & T(x,y) < 0 \end{cases}$ 

      we define 

      $U(x,y) = \mathrm{EXP}(C(x,y)\,T^n(x,y))$ 

      where n = 2 in problem 3 and n = 4 in problem 4. The 
      discontinuity lies along a parabola which passes through the 
      central grid point for all grids and two boundary grid 
      points on all but the coarsest grid. 
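
The geometric claims about the discontinuity curve can be checked directly. The sketch assumes that the illegible term in T is x squared, which is the choice that makes the parabola pass through the central grid point; the handling of T = 0 in `C` is also an assumption, since the source defines C only for T > 0 and T < 0.

```python
import math


def T(x, y):
    # Assumed reading of the source: T(x, y) = y - x^2 + x - .75
    return y - x * x + x - 0.75


def C(x, y):
    # The source leaves T = 0 undefined; U is 1 there for either sign.
    return 1.0 if T(x, y) > 0 else -1.0


def U(x, y, n):
    """Test solution with a jump in its n-th derivative across T = 0."""
    return math.exp(C(x, y) * T(x, y) ** n)


if __name__ == "__main__":
    print(T(0.5, 0.5), T(0.0, 0.75), T(1.0, 0.75))
    print(U(0.5, 0.5, 2))
```

The three points printed are the centre of the square and the boundary points (0, .75) and (1, .75); the latter are grid points for h = 1/4 and finer but not for h = 1/2, in line with the statement about the coarsest grid.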


Operation counts were made for every step of the multi-grid algorithm, 
including residual formation and transfer and interpolations, where one 
multiplication was counted as two additions. For each method, the fixed 
algorithm accumulated $C N_h$ operations to obtain an approximation on grid 
h, where $N_h$ is the total number of points on grid h. We report the 
constant C occurring for each method. 

Method   MG2   MGW2   tau   Wtau   H2    H4    H3    D2    D4    D3 

C        71    85     85    100    141   141   148   107   107   117 



In the experiments, the weighted discrete $L^2$-norm of the true error, 
$E^h = U - u^h$, taken at the end of the iteration on each grid h, is given by: 

$$A(h) = \Big(\sum_j (E_j^h)^2 \cdot h^2\Big)^{1/2}$$

where the summation is over the interior points of grid h. This discrete 
$L^2$-norm makes the norms on different grids comparable. Table 1 lists A(h) 
(h = 1/16, 1/32 and 1/64) for the 10 methods on problems 1, 2 and 4. The 
ratios A(2h)/A(h) are also given to show the relative gain in accuracy by 
going to a finer grid. 
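
A minimal sketch of the error measure and of the ratio reported in Table 1, assuming the norm is the square root of h^2 times the sum of squared errors over interior points. The function names are hypothetical; the two sample ratios use the MG2 entries for problem 1.

```python
import math


def weighted_l2_norm(errors, h):
    """Assumed form: sqrt(sum of e_j^2 times h^2) over interior grid points."""
    return math.sqrt(sum(e * e for e in errors) * h * h)


def gain(norm_2h, norm_h):
    """Ratio A(2h)/A(h); about 4 for second order, about 16 for fourth order."""
    return norm_2h / norm_h


if __name__ == "__main__":
    # MG2 on problem 1: A(1/16) = .55(-2), A(1/32) = .14(-2), A(1/64) = .34(-3)
    print(round(gain(0.55e-2, 0.14e-2), 1), round(gain(0.14e-2, 0.34e-3), 1))
```

The h^2 weight is what makes a unit error on every point give roughly the same norm on every grid, so that ratios between grids reflect the order of accuracy rather than the number of points.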

In problem 1, methods MGW2 and W$\tau$ give consistently smaller errors than 
methods MG2 and $\tau$, respectively. This relationship is also found, though 
less significantly, in problems 2 and 4. The operator $L_3^h$ gives the 
smallest errors in both the high order relaxation and defect correction 
methods for problems 1, 2 and 4. The operator $L_2^h$ gives the largest 
errors. It should be noted that the Mehrstellen Verfahren discretization, 
$L_3^h$, is special to our model problem, but has its generalization in the 
HODIE methods (ref. 5). 


The second order errors obtained by method MG2 on problem 3 were: 

A(l/16) = .90 X 10"^, A(l/32) = .66 x 10“^ and A(l/64) = .14 x lO"^. The 
fourth order methods all gave very nearly the same errors as MG2 which would 
be expected for such a nonsmooth problem. 
TABLE 1 - COMPARISON OF ERRORS 
(entries a(b) denote a x 10^b; "..." marks entries that are not legible) 

PROBLEM   METHOD   A(1/16)    A(1/32)    A(1/64)    A(1/16)/A(1/32)   A(1/32)/A(1/64) 

1         MG2      .55(-2)    .14(-2)    .34(-3)     3.9               4.1 
          MGW2     .58(-3)    ...        ...         4.1               4.0 
          tau      .30(-3)    .19(-4)    .12(-5)    16.               16. 
          Wtau     .19(-3)    .47(-5)    .17(-6)    40.               28. 
          H2       .36(-3)    .22(-4)    .89(-6)    16.               25. 
          H4       .11(-3)    ...        ...        16.               16. 
          H3       .26(-4)    .14(-5)    ...        19.               17. 
          D2       .75(-3)    .38(-4)    .16(-5)    20.               24. 
          D4       .43(-3)    .19(-4)    .90(-6)    23.               21. 
          D3       ...        ...        .21(-6)    22.               23. 

2         MG2      .17(+0)    .39(-1)    .95(-2)     4.3               4.1 
          MGW2     ...        .15(-1)    .39(-2)     4.2               3.8 
          tau      .27(+0)    .84(-2)    .42(-3)    32.               20. 
          Wtau     .68(-1)    .13(-1)    .77(-3)     5.               17. 
          H2       ...        ...        .12(-2)    11.               15. 
          H4       .43(-1)    ...        ...        14.               14. 
          H3       .36(-2)    .81(-3)    .88(-4)     4.                9. 
          D2       .20(+0)    .18(-1)    .15(-2)    11.               12. 
          D4       .42(-1)    .66(-2)    .42(-3)     6.               16. 
          D3       .37(-1)    ...        ...        11.               13. 
TABLE 1 - COMPARISON OF ERRORS (CONT'D.) 

PROBLEM   METHOD   A(1/16)    A(1/32)    A(1/64)    A(1/16)/A(1/32)   A(1/32)/A(1/64) 

4         MG2      .24(-3)    .60(-4)    .15(-4)     4.0               4.0 
          MGW2     .14(-3)    .36(-4)    .91(-5)     3.9               4.0 
          tau      .13(-4)    .94(-6)    .64(-7)    14.               15. 
          Wtau     .11(-4)    .67(-6)    .36(-7)    16.               19. 
          H2       .59(-4)    .43(-5)    .30(-6)    14.               14. 
          H4       .24(-5)    .24(-6)    .18(-7)    10.               13. 
          H3       .25(-5)    .21(-6)    .16(-7)    12.               13. 
          D2       .73(-4)    .55(-5)    .41(-6)    13.               13. 
          D4       .17(-4)    .15(-5)    .12(-6)    11.               13. 
          D3       .58(-5)    .44(-6)    .26(-7)    13.               17. 
Accelerated Convergence of Structured Banded Systems 
Using Constrained Corrections* 


Karl Kneile 

Sverdrup Technology, Inc. 


1. INTRODUCTION 

The goal of this paper is to describe an efficient iterative method 
for solving a structured banded system of equations. While the method 
was developed for a full potential flow program, it will be presented in 
general terms applicable to a wide range of problems. The central issue 
here is the solution of a large linear system of equations. The linear 
system may arise directly in the problem or may result from an iteration 
in a nonlinear problem. For large 2-D and 3-D applications, this linear 
system becomes increasingly expensive to solve directly. As a result 
efficient iterative methods have become attractive for large problems. 

In the nonlinear cases, these iterations may be effectively merged to 
improve convergence rates. 

Conventional iterative methods (Jacobi, Gauss-Seidel, ADI, etc.) 
rapidly reach a state where convergence rates are limited by the large 
eigenvalues of the system. This phenomenon is especially restrictive for 
large problems. Various approaches have been tried to accelerate 
convergence. Relaxation made some modest gains, but obtaining an optimum 
or near optimum parameter was sometimes difficult. Others tried more 
elaborate iterative methods (incomplete Crout, strongly implicit 
procedure (SIP), and SIP/conjugate gradient) with considerable success. 
However, the most dramatic improvements have been seen recently with the 
revival of multigrid concepts. 

The method presented in this paper uses a basic iteration step 
(incomplete Crout reduction), a dynamic relaxation step, and a multigrid 
concept of constraining iterative corrections. 


The work reported herein was conducted for the Arnold Engineering 
Development Center (AEDC), Air Force Systems Command (AFSC) by 
Sverdrup Technology, Inc., an operating contractor for the AEDC. 
Further reproduction is authorized to satisfy needs of the U. S. 
Government. 
2. METHOD OF CONSTRAINED CORRECTIONS 


The method of constrained corrections uses a variational form of the 
problem. This variational form may be part of the problem definition or 
may be artificially created as described later. 

The discretized variational form may be represented as $L(\phi)$, where L 
is a scalar function of the n component vector $\phi$. The components 
of $\phi$ are obtained by solving the n simultaneous equations 

$$F = \partial L/\partial\phi = 0 \qquad (1)$$

An iterative procedure (Newton's) for solving this system is 

$$A\,\delta = r \qquad (2)$$

where 

$$\delta = \phi_{i+1} - \phi_i, \quad r = -F_i, \quad \text{and} \quad A = \partial F/\partial\phi\,|_i = \partial^2 L/\partial\phi^2\,|_i.$$

For linear problems the iteration process is trivial, ending with the 
first iteration. 

If a variational principle is not part of the problem, one can define 

$$L^*(\delta) = \tfrac{1}{2}\,\delta^T A\,\delta - \delta^T r \qquad (3)$$

and use 

$$\partial L^*/\partial\delta = 0 \qquad (4)$$

as the variational form. This is equivalent, both here and in later 
considerations, to using 

$$\partial L(\phi_i + \delta)/\partial\delta = 0 \qquad (5)$$

coupled with Newton's method if (5) is nonlinear in $\delta$. The method of 
constrained corrections defines $\delta$ as 

$$\delta = Ck \qquad (6)$$
The vector k has p < n components. The matrix C prescribes each 
component of $\delta$ as a linear combination of the components of k. It will be 
assumed that C is of rank p (i.e., the columns of C are linearly 
independent). This is equivalent to imposing (n-p) linear combinations 
of the components of $\delta$ as zero. That is 

$$C^*\,\delta = 0 \qquad (7)$$

where $C^*$ is an (n-p) x n matrix. It is more convenient to use these 
constraints in the form of (6). 

Substitution of (6) into the variational form results in the system 
of equations 

$$C^T A C\,k = C^T r \qquad (8)$$

for the unknown k vector. The $\delta$ vector is then obtained from (6). With 
judicious selections of C and k, the convergence rate can be 
substantially improved. 
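
A small sketch of the constrained correction: form C^T A C and C^T r, solve for k, and return the correction C k. The helper names and the dense Gaussian elimination are assumptions chosen to keep the example self-contained; the paper itself is concerned with large banded systems and does not prescribe this solver.

```python
def solve_dense(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def constrained_correction(A, r, C):
    """Solve (C^T A C) k = C^T r and return the correction delta = C k."""
    n, p = len(C), len(C[0])
    AC = [[sum(A[i][l] * C[l][j] for l in range(n)) for j in range(p)] for i in range(n)]
    M = [[sum(C[l][i] * AC[l][j] for l in range(n)) for j in range(p)] for i in range(p)]
    rhs = [sum(C[l][i] * r[l] for l in range(n)) for i in range(p)]
    k = solve_dense(M, rhs)
    return [sum(C[i][j] * k[j] for j in range(p)) for i in range(n)]


if __name__ == "__main__":
    A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    r = [0.0, 2.0, 0.0]            # equals A times [1, 2, 1]
    C = [[0.5], [1.0], [0.5]]      # one basis value, neighbors interpolated
    print(constrained_correction(A, r, C))
```

In this example the exact correction happens to lie in the range of C, so the constrained solve recovers it; in general the result is the best correction of the form C k in the sense of the variational form.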


3. DYNAMIC RELAXATION 

Consider again the basic linear system (2). When using an iterative 
method, one obtains an approximate $\delta$ denoted by $\delta_a$. By letting C in (6) 
be the vector $\delta_a$, the relaxation parameter k is then given by 

$$k = \delta_a^T r \,/\, \delta_a^T A\,\delta_a \qquad (9)$$

The iteration then takes the form 

$$\phi_{i+1} = \phi_i + k\,\delta_a \qquad (10)$$

The residual r in (9) is the original residual vector using $\phi_i$ and not 
from using $\phi_i + \delta_a$. 
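
The relaxation parameter of (9) is the single-column case of the constrained system and needs only two inner products. The sketch below uses hypothetical names and a small diagonal matrix chosen purely for illustration.

```python
def relaxation_factor(A, r, delta_a):
    """k = (delta_a . r) / (delta_a . A delta_a), with r the original residual."""
    n = len(r)
    A_delta = [sum(A[i][j] * delta_a[j] for j in range(n)) for i in range(n)]
    numerator = sum(d * ri for d, ri in zip(delta_a, r))
    denominator = sum(d * ad for d, ad in zip(delta_a, A_delta))
    return numerator / denominator


def relaxed_update(phi, delta_a, k):
    """phi_new = phi + k * delta_a."""
    return [p + k * d for p, d in zip(phi, delta_a)]


if __name__ == "__main__":
    A = [[2.0, 0.0], [0.0, 1.0]]
    r = [2.0, 1.0]            # exact correction is [1, 1]
    delta_a = [2.0, 2.0]      # approximate correction, twice too large
    k = relaxation_factor(A, r, delta_a)
    print(k, relaxed_update([0.0, 0.0], delta_a, k))
```

Here the approximate correction points in the right direction but overshoots by a factor of two, and the computed k = 0.5 scales it back to the exact correction.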


4. INTERPOLATED FORM 

It is easier to describe this form for one dimensional problems. The 
components of $\phi$ are associated with a positional value along this 
dimension. As mentioned earlier, conventional iterative methods rapidly reach 
an asymptotic convergence limited by the larger eigenvalues. It is well 
known that these iterations rapidly remove the smaller wavelength 
components, leaving the smooth longer wavelength components. Conventional 
multigrid methods exploit this smoothness to justify using a coarse grid 

287 



operator. This paper will also take advantage of this smoothing 
property, but will direct emphasis towards the smoothness of the 
correction vector 6. A basis vector is selected. The correction 
vector is then constrained to the following form 


6 




( 11 ) 


In actual practice the components of $\delta_b$ are interspersed within
$\delta$ and the rows of $B$ are merged with the rows of the identity
matrix. The above representation (separated $I$ and $B$) will be used to
simplify notation.

For this special case the constraints take the form

$$ C^* \delta = \begin{bmatrix} B & -I \end{bmatrix} \delta = 0 \tag{12} $$


The $B$ matrix represents interpolation coefficients for the nonbasis
components. Solutions of

$$ C^T A C\,\delta_b = C^T r \tag{13} $$

coupled with (11) will then "solve the original problem" subject to the
constraints. The effectiveness of this constrained form depends upon the
form of interpolation used, the smoothness of $\delta$, and the
difficulty in solving the new system (13). The smaller the dimension of
$\delta_b$, the simpler system (13) is to solve. However, more iterations
are needed to precondition the smoothness required for effective
interpolation.

The dynamic relaxation step described in the previous section can be
used to improve overall convergence. The relaxation factor may be
calculated using the basis $\delta_b$.

$$ k = \frac{\delta_b^T (C^T r)}{\delta_b^T (C^T A C)\,\delta_b} \tag{14} $$


5. MULTIGRID 

The constrained corrections method described in the previous section
can be easily adapted to a multigrid concept. A nested sequence of basis
vectors is defined by

$$ \delta = \delta_0, \qquad \delta_{i-1} = \begin{bmatrix} I \\ B_i \end{bmatrix} \delta_i = C_i\,\delta_i, \qquad i = 1, 2, \ldots, m \tag{15} $$

where $\delta_0$ represents the original fine grid and $\delta_m$ the
coarsest level with the fewest components. The $C_i$ are interpolation
coefficients from the $i$-th level to the $(i-1)$ level. The above
interpolations may be combined to form

$$ \delta_0 = D_i\,\delta_i, \qquad D_i = C_1 C_2 \cdots C_i = D_{i-1} C_i \tag{16} $$


The constrained system of equations is

$$ D_i^T A D_i\,\delta_i = D_i^T r \tag{17} $$


These equations easily lend themselves to the following iterative
algorithm.

A smoothing pass is made on the fine grid system

$$ A\,\delta = r \tag{18} $$

The $\phi$ vector is updated by (10), whereby (18) now represents the
next iterative pass. A compression step is taken to obtain the system

$$ A_1 \delta_1 = r_1, \quad \text{where} \quad A_1 = D_1^T A D_1 \quad \text{and} \quad r_1 = D_1^T r \tag{19} $$

A smoothing pass is now made on this system. The $\phi$ vector is again
updated and the next iterative pass is taken at the second level

$$ A_2 \delta_2 = r_2, \quad \text{where} \quad A_2 = D_2^T A D_2 \quad \text{and} \quad r_2 = D_2^T r \tag{20} $$

The process is repeated down through the coarsest level.


The above describes a multigrid cycle. This cycle is then repeated 
until sufficient convergence is obtained. Many variations of the above 
algorithm are possible. A few of these are compared in Section 8. 
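One cycle of this kind can be sketched for a 1-D model problem. This is only an illustration under stated assumptions: a single Gauss-Seidel pass stands in for the incomplete Crout smoother, the grid sizes 33, 17, 9, 5, 3 are chosen for convenience, and the helper names (`interp_1d`, `smoothing_pass`, `multigrid_cycle`) are hypothetical.

```python
import numpy as np

def interp_1d(nc):
    """Linear interpolation from nc coarse points to 2*nc - 1 fine points."""
    C = np.zeros((2 * nc - 1, nc))
    for j in range(nc):
        C[2 * j, j] = 1.0                  # basis point carried over
    for j in range(nc - 1):
        C[2 * j + 1, j] = 0.5              # interpolated point
        C[2 * j + 1, j + 1] = 0.5
    return C

def smoothing_pass(A, r):
    """One Gauss-Seidel pass from a zero guess (stand-in smoother)."""
    return np.linalg.solve(np.tril(A), r)

def multigrid_cycle(A_levels, C_levels, b, phi):
    """One pass from the fine grid (level 0) down to the coarsest level."""
    r = b - A_levels[0] @ phi              # fine grid residual
    for i, A_i in enumerate(A_levels):
        d = smoothing_pass(A_i, r)
        dAd = d @ (A_i @ d)
        if dAd > 0.0:
            k = (d @ r) / dAd              # dynamic relaxation parameter
            r = r - k * (A_i @ d)          # residual updated at level i
            corr = k * d
            for C in reversed(C_levels[:i]):
                corr = C @ corr            # interpolate back to the fine grid
            phi = phi + corr
        if i < len(C_levels):
            r = C_levels[i].T @ r          # compress residual one level down
    return phi

n = 33
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C_levels = [interp_1d(nc) for nc in (17, 9, 5, 3)]     # C_1 ... C_4
A_levels = [A]
for C in C_levels:
    A_levels.append(C.T @ A_levels[-1] @ C)            # coarse level systems

b = np.sin(np.linspace(0.0, 3.0, n)) + 1.0
exact = np.linalg.solve(A, b)

def energy_error(phi):
    e = phi - exact
    return float(np.sqrt(e @ (A @ e)))

phi = np.zeros(n)
errors = [energy_error(phi)]
for _ in range(3):
    phi = multigrid_cycle(A_levels, C_levels, b, phi)
    errors.append(energy_error(phi))
print(errors)
```

Each level applies its correction with the dynamically computed k, so the error measured in the energy norm cannot increase from one step to the next. The rate obtained depends on the smoother and is not meant to reproduce the figures reported in Section 8.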


A computational advantage can be obtained from the nesting or
recursive definitions of the $D_i$. The next level system can be
calculated directly from the current level

$$ A_{i+1} = C_{i+1}^T A_i\,C_{i+1}, \qquad r_{i+1} = C_{i+1}^T r_i \tag{21} $$


It is not necessary to calculate the updated residuals at the fine
grid level. They may be calculated at the current $i$-th level and then
compressed down one level.
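The equivalence between the level-by-level compression and the direct definition through the combined interpolation can be checked numerically. The sketch below assumes a 9-point 1-D model problem with linear interpolation at every other point; `interp_1d` and the test data are illustrative only.

```python
import numpy as np

def interp_1d(nc):
    """Linear interpolation from nc coarse points to 2*nc - 1 fine points."""
    C = np.zeros((2 * nc - 1, nc))
    for j in range(nc):
        C[2 * j, j] = 1.0
    for j in range(nc - 1):
        C[2 * j + 1, j] = 0.5
        C[2 * j + 1, j + 1] = 0.5
    return C

n = 9
A0 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
r0 = np.arange(1.0, n + 1.0)

C1 = interp_1d(5)          # level 1 -> level 0, 9 x 5
C2 = interp_1d(3)          # level 2 -> level 1, 5 x 3
D2 = C1 @ C2               # level 2 -> fine grid, 9 x 3

# Level-by-level compression.
A1 = C1.T @ A0 @ C1
A2_recursive = C2.T @ A1 @ C2
r2_recursive = C2.T @ (C1.T @ r0)

# Direct compression from the fine grid.
A2_direct = D2.T @ A0 @ D2
r2_direct = D2.T @ r0

print(np.allclose(A2_recursive, A2_direct), np.allclose(r2_recursive, r2_direct))
```

The recursive route only ever multiplies by the small matrices of adjacent levels, which is where the computational saving comes from.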


6. STRUCTURED BANDING 

Consider those problems where a quantity $\phi$ is to be determined over
a 2-D or 3-D space. The space is discretized by an $(n_1 \times n_2)$ or
$(n_1 \times n_2 \times n_3)$ grid. The $A$ matrix in (2) takes a
structured banded form. That is, only a few of the diagonals of $A$ have
nonzero elements.
The particular structure of A depends upon the approximations used in 
describing the original equations at the grid points. This paper will 
cover the details for a nine diagonal structure typical of a 2-D finite 
element approach. Adaptations to other type problems should not pose any 
difficulties. Occasional comments concerning other type problems will be 
made at appropriate places. 

The structure of A can be considered as a block tridiagonal system 
where the blocks are also tridiagonal. This simple structure allows 
computationally efficient smoothing algorithms. It is therefore 
desirable to maintain this structure through the multigrid levels. This 
imposes limitations on the interpolation matrix B. For simplicity, a 1-D 
case will be described first. The structured banded A matrix is 
tridiagonal. In order to maintain this tridiagonal structure, the 
interpolation for any point must be limited to a nearest neighbor 
principle. That is, an interpolated point is a function of only the two 
neighboring points, one on each side. This suggests a linear 
interpolation. 

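Figure 1 depicts a case where the $\delta$ vector has been smoothed. The
corresponding $C$ matrix has the form (note: the rows of $B$ and $I$ are
merged)

$$ C = \begin{bmatrix}
1   &     &     &     \\
1/2 & 1/2 &     &     \\
    & 1   &     &     \\
    & 2/3 & 1/3 &     \\
    & 1/3 & 2/3 &     \\
    &     & 1   &     \\
    &     & 3/4 & 1/4 \\
    &     & 1/2 & 1/2 \\
    &     & 1/4 & 3/4 \\
    &     &     & 1
\end{bmatrix} \tag{22} $$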



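In the typical multigrid patterns where every other point is used, $C$
has the following form for a nine point to five point compression.

$$ C^T = \begin{bmatrix}
1 & 1/2 &   &     &   &     &   &     &   \\
  & 1/2 & 1 & 1/2 &   &     &   &     &   \\
  &     &   & 1/2 & 1 & 1/2 &   &     &   \\
  &     &   &     &   & 1/2 & 1 & 1/2 &   \\
  &     &   &     &   &     &   & 1/2 & 1
\end{bmatrix} \tag{23} $$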
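Both forms of C follow from the nearest neighbor principle and can be generated from a list of basis point positions. The sketch below assumes equal spacing and that the first and last points are basis points; the function name `interpolation_matrix` and the index lists are illustrative.

```python
import numpy as np

def interpolation_matrix(n, basis):
    """Rows of I (basis points) merged with rows of B (linear interpolation).

    n      number of fine points
    basis  indices of the fine points kept as basis points; the first
           and last fine points are assumed to be included
    """
    basis = sorted(basis)
    C = np.zeros((n, len(basis)))
    for col, idx in enumerate(basis):
        C[idx, col] = 1.0                       # identity row
    for col in range(len(basis) - 1):
        left, right = basis[col], basis[col + 1]
        for idx in range(left + 1, right):
            w = (idx - left) / (right - left)   # equal spacing assumed
            C[idx, col] = 1.0 - w               # weight of left neighbor
            C[idx, col + 1] = w                 # weight of right neighbor
    return C

# Every other point: nine points compressed to five.
C_every_other = interpolation_matrix(9, [0, 2, 4, 6, 8])

# Uneven pattern with one, two and three interpolated points
# between successive basis points.
C_uneven = interpolation_matrix(10, [0, 2, 5, 9])

print(C_every_other.T)
print(C_uneven)
```

Every row of C sums to one, and each interpolated row touches only the two neighboring basis columns, which is what keeps the compressed 1-D system tridiagonal.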

The nearest neighbor principle for 2-D problems is shown in 
Figure 2. The coarse grid basis is represented by solid dots. The open 
circles are interpolated points. The arrows point to the basis 
components that are used for interpolation. This interpolation can be 
factored into two 1-D steps. Figures 3a and 3b show the two steps. The 
first step reduces from a 5 x 5 grid to a 5 x 3 grid. The second then 
reduces down to a 3 x 3 grid. The same principle will factor a 3-D 
problem in three steps. 
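The two-step factoring can be written with Kronecker products. The sketch below assumes the grid values are stored with the second index running fastest and uses a 5-point to 3-point linear interpolation in each direction; the variable names are illustrative.

```python
import numpy as np

# 1-D interpolation from 3 basis points to 5 points (every other point).
P = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])

I5 = np.eye(5)
I3 = np.eye(3)

# Step 1: interpolate along the second direction, 5 x 3 grid -> 5 x 5 grid.
step1 = np.kron(I5, P)        # 25 x 15
# Step 2: interpolate along the first direction, 3 x 3 grid -> 5 x 3 grid.
step2 = np.kron(P, I3)        # 15 x 9

# Combined 2-D interpolation from the 3 x 3 basis to the 5 x 5 grid.
C_2d = step1 @ step2          # 25 x 9

print(C_2d.shape, np.allclose(C_2d, np.kron(P, P)))
```

Read in the compression direction, the transposes of these two factors take a 5 x 5 grid to 5 x 3 and then to 3 x 3. A 3-D problem adds a third factor of the same kind.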


Interpolation for unequal spacing and irregular geometries is more 
involved. A convenient alternative is to interpolate as if the geometry 
were regular with equal spacings. This retains the calculations in a 
simple form. The result is a "nonlinear form" of interpolation. The 
longer wavelength information is still passed down to the coarser level. 
The interpolation errors, as with linear interpolation, are short 
wavelength in nature and are reduced with the next smoothing pass at the 
current level. 


7. SMOOTHING PASS 

One of the key elements of the multigrid algorithm is that the 
wavelength components comparable to grid size must be damped before going 
to a coarser level. Fortunately, this is the strong point of the 
conventional iterative methods. The method emphasized in this report is 
incomplete Crout reduction. Two variations are used in this report. The
methods are identical to a complete Crout reduction with the following
modifications during the forward pass. In the short version when zeroing 
an element below the main diagonal, all operations which modify an 
off-diagonal element are not performed. The result is a quick efficient 
iteration which damps out the short wavelength error components. In the 
long version, all operations which modify the nonzero structured banded 
elements are kept. All other operations which would modify zero elements 
are not performed. The long version has better iterative properties at 
the expense of the additional work required. For the nine point star 2-D 
case, the increase in operations is about 60%. 
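The idea behind the long version, keeping only operations that modify elements already inside the nonzero banded structure, is close to what is now commonly called an ILU(0) factorization. The sketch below is that standard ILU(0) in row-by-row (unit lower triangular) form with dense storage for clarity. It is not the Crout ordering used in this paper, it does not attempt the short version, and the names `ilu0` and `ilu0_solve` are illustrative.

```python
import numpy as np

def ilu0(A):
    """Incomplete LU keeping only entries inside the sparsity pattern of A."""
    n = A.shape[0]
    pattern = A != 0
    LU = A.astype(float).copy()
    for i in range(1, n):
        for k in range(i):
            if not pattern[i, k]:
                continue
            LU[i, k] /= LU[k, k]                 # multiplier l_ik
            for j in range(k + 1, n):
                if pattern[i, j]:                # skip fill outside the pattern
                    LU[i, j] -= LU[i, k] * LU[k, j]
    return LU    # unit lower L below the diagonal, U on and above it

def ilu0_solve(LU, r):
    """Forward and back substitution with the incomplete factors."""
    n = LU.shape[0]
    L = np.tril(LU, -1) + np.eye(n)
    U = np.triu(LU)
    return np.linalg.solve(U, np.linalg.solve(L, r))

# 1-D tridiagonal case: no fill occurs, so the factorization is exact.
n = 6
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
r = np.ones(n)
delta_1d = ilu0_solve(ilu0(T), r)

# 2-D five-point case on a 4 x 4 grid: fill is dropped, so one pass
# gives only an approximate correction.
T4 = 2.0 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
A2 = np.kron(np.eye(4), T4) + np.kron(T4, np.eye(4))
LU2 = ilu0(A2)
delta_2d = ilu0_solve(LU2, np.ones(16))
```

In the 2-D case the product of the incomplete factors reproduces A exactly on the retained nonzero positions and differs from it only where fill was discarded.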


8. TEST CASE


The preceding methods were used to solve Laplace's equation on a 
rectangular grid. Dirichlet conditions were imposed at the boundaries. 
The primary goal of this test case was to verify the method and help in 
comparing alternatives. The system (2) was obtained using isoparametric 
quadrilateral finite elements. The basic algorithm consisted of the 
following. A multigrid cycle of (m + 1) levels was used (m = 0 means 
fine grid only). The coarser levels were obtained by removing every 
other point in each dimension. Each level contained one smoothing pass
(the short version of incomplete Crout followed by a dynamic
relaxation). The results (convergence rates) are given in the number of
work units required to reduce the error by one order of magnitude. A 
work unit is defined as the time to set up a fine grid system and make 
one smoothing pass. It was assumed that the time spent at a lower level 
was one fourth that of the next higher level. For comparison with 
conventional multigrid methods it is also assumed that the time required 
to compress down to the next lower level is equivalent to that of 
evaluating the operator at that level. For simple linear problems such 
as Laplace's or Poisson's equation, the operator evaluation should be 
quicker. However, for nonlinear problems such as full potential flow, 
the compression step will probably be faster. The rates given are 
estimates of the asymptotic rate. They were obtained by iterating until 
the rates "leveled off". In cases where convergence showed cyclic or 
erratic behavior, an average of a selected final group of iterations was 
used. Most of the results are given for 9 x 9, 17 x 17, and 33 x 33
grids with equal spacings. A few results where $n_1 \neq n_2$ and where
$\Delta x \neq \Delta y$ are given at the end of the section.


Table 1 shows the convergence rates for the basic algorithm. 



    n1 x n2 \ m      0       1       2       3       4       5

    9 x 9           3.8     1.1     1.2     1.2
    17 x 17        11.5     2.1     1.3     1.3     1.3
    31 x 31        39.7     6.6     1.8     1.3     1.3     1.3

TABLE 1. Convergence Rates for Basic Algorithm

Without multigrid levels, the convergence rate rapidly deteriorates with
increasing grid size. Using multigrid levels improves the convergence
rates for each grid size, and the results appear independent of grid
size. The table indicates that the extra coarse grids are not needed,
but nothing is lost by the conservative attitude of using more levels
than needed. For comparison purposes the 1.3 convergence rate is
equivalent to an error reduction factor of 0.17 per work unit or 0.093
per multigrid cycle. This table should be used as a reference for 
comparison of alternative methods given in this section. 
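The conversion between these measures is simple arithmetic and can be sketched as follows. The per-cycle figure below assumes that a cycle costs only the smoothing passes, with each level one fourth the cost of the next finer one; the paper's own accounting of a cycle may differ slightly, so the second number is indicative only. The function names are illustrative.

```python
def reduction_per_work_unit(rate):
    """Error reduction factor per work unit for a rate given in
    work units per order of magnitude."""
    return 10.0 ** (-1.0 / rate)

def cycle_work_units(m, ratio=0.25):
    """Work units in one cycle of m + 1 levels, assuming each coarser
    level costs `ratio` times the level above it."""
    return sum(ratio ** i for i in range(m + 1))

per_work_unit = reduction_per_work_unit(1.3)            # about 0.17
per_cycle = per_work_unit ** cycle_work_units(5)        # roughly 0.09

print(round(per_work_unit, 3), round(per_cycle, 3))
```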

Table 1 considers the case where the $n_1 \times n_2$ grid points are
all interior to the boundary. The boundary conditions had been
transferred to the right hand side of the equations. An alternative would
be to include the boundary points as part of the $n_1 \times n_2$ grid.
The boundary points have no error and the neighboring points are
interpolated using this zero error boundary. The results are shown in
Table 2.


    n1 x n2 \ m      0             1       2       3       4       5

    9 x 9           [illegible]   1.3     1.2     1.2
    17 x 17         9.2           2.3     1.2     1.3     1.3
    31 x 31        34.3           9.5     2.0     1.4     1.4     1.4

TABLE 2. Convergence Rates for Alternate Boundary Conditions


The faster convergence at the zero level is due to the fewer non-zero 
error components. However, where multigrid levels are used, this trend 
is reversed. The difference is small though, when sufficient levels are 
used. Ease of application should probably be the deciding factor. 

Table 3 shows the results when a Jacobi ADI method is used for
smoothing. The dynamic relaxer was applied after each of the two ADI
sweeps. The multigrid convergence rates are about the same as the
incomplete Crout reduction. The main reason for choosing the incomplete
Crout was its efficiency. It is easily programmed. For large systems it
requires 5 divides (with common divisor), and 12 multiply-add
combinations per equation. If applied to a 5 point star system obtained
from finite differences, the operation count is 3 divides and 6
multiply-adds per equation.



    n1 x n2 \ m      0             1       2       3       4       5

    9 x 9           [missing]     1.4     1.2     1.3
    17 x 17         [missing]     3.5     1.5     1.4     1.3
    31 x 31        76.6          12.0     2.6     1.5     1.4     1.3

TABLE 3. Convergence Rates for ADI Smoothing


Several alternative methods of cycling through the multigrid levels 
were tried. The order did not significantly change the asymptotic 
convergence rates. That is, it makes no difference whether one starts 
with the fine grid and works down to the coarse or vice versa. Attempts 
at weighting the coarse grid passes were also tried. Table 4 shows 
results where the number of passes at each level varied linearly from 1 
for level 0 to 6 for level 5. 



    n1 x n2 \ m      0       1       2       3       4       5

    9 x 9           3.8     1.4     1.6     1.6
    17 x 17        11.5     1.4     1.6     1.7     1.7
    31 x 31        39.7     3.6     1.6     1.7     1.7     1.7

TABLE 4. Convergence Rates for Weighted Passes



Some improvement is noticed for a few cases where the number of levels
is insufficient. However, when sufficient levels of grid are used,
equal weighting (one pass per level) is better. The degradation of
convergence rate is due to the additional work involved. If the rates
were given on a "per multigrid cycle" basis, they would be about equal to
those using one pass per level. It is important to emphasize that these
are asymptotic rates. It was noticed that the weighted cycle was superior
in the initial stages. This was attributed to the smooth errors present
with the initial guesses. Such a phenomenon is likely with actual
problems. Therefore, it is suggested that the early passes be weighted
towards the coarser levels with later passes of one per level.

Table 5 shows results without using the dynamic relaxer. 



TABLE 5. Convergence Rates Without Dynamic Relaxer 


While the relaxer made significant improvements at the zero level, only 
modest gains were achieved when multigrid levels were used. Since its 
cost is minimal, the dynamic relaxer was retained in the basic 
algorithm. Its potential gain for other applications may be
significant. For example, a 25% savings in time was obtained in the
multigrid/ADI cases. For comparison purposes, a fixed optimum parameter
was determined by trial and error. Convergence rates for the dynamic 
relaxer and the fixed optimum were essentially the same. 

Simultaneous relaxation parameters were also tried. The corrections
at each level were saved and used as columns of $C$ in (6). The system of
equations (8) can then be solved for the relaxation parameters
(components of $k$). The results were disappointing. Only trivial gains
were noticed, not worth the extra work and storage required.

An interesting alternative is to use a constant times the residuals
as the smoothers. The dynamic relaxer can be used to determine the
unknown constant. Convergence was erratic with rates in the 2.0 to 4.0
range. For linear problems, this rate would be attractive since the time
per iteration is minimal. For nonlinear problems, calculating the fine
grid system and compression to coarser levels takes most of the time, and
the overall rates would be considerably slower. A problem with this
alternative is the relative scaling of each equation.
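Taking the residual itself as the correction direction and letting the dynamic relaxer supply the constant gives, on a single level, the classical steepest descent step. The sketch below shows only that single-level form on a hypothetical 1-D model problem; it is not the multilevel experiment reported above.

```python
import numpy as np

# Hypothetical test system: tridiagonal (-1, 2, -1) with n = 16.
n = 16
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
exact = np.linalg.solve(A, b)

def energy_error(phi):
    e = phi - exact
    return float(np.sqrt(e @ (A @ e)))

phi = np.zeros(n)
errors = [energy_error(phi)]
for _ in range(10):
    r = b - A @ phi
    k = (r @ r) / (r @ (A @ r))    # dynamic relaxer with delta_a = r
    phi = phi + k * r
    errors.append(energy_error(phi))

print(errors[0], errors[-1])
```

Since the direction is the raw residual, multiplying one equation by a constant changes the direction taken, which is one way to see the scaling issue mentioned above.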

Several rectangular grids were also tried. The convergence rate for 
a 9 X 33 grid was in the 1.2 to 1.3 range that was obtained for the 
square grids. The incomplete Crout smoother is order dependent. That 
is, different results are obtained depending upon whether the grid is 
numbered by rows or by columns. The convergence rates for the 9 x 33 
grid were essentially unaffected by the direction of node ordering. 
Ordering along the short dimension gave less than a 2% improvement over
the other direction.

Figure 4 shows results for varying aspect ratios. A 33 x 33 grid was
used in this study. The solid line shows the results using the short
incomplete Crout reductions. The nodes were numbered in the Y
direction. As the figure indicates, the convergence rate rapidly becomes
impractical for even moderate aspect ratios. Numbering the nodes in the
X direction gave the same behavior. The dashed line shows results for
the long version of incomplete Crout reduction (nodes numbered in the Y
direction). The worst convergence rate occurs at an aspect ratio of
about 10. About twice as many iterations are needed at this aspect
ratio. At larger ratios, the convergence rate rapidly improves.

Ordering the nodes in the X direction gives the results shown by the 
dotted line. Except for a small increase at the smaller aspect ratios, 
this ordering gave better results. The unexpected improvement at large 
aspect ratios is probably due to the regular rectangular geometry. 
Extrapolation of these results to more general geometries would be 
speculative. Further study is needed particularly in the selection of 
the smoother. 


9. SUMMARY

A constrained corrections algorithm was described in the previous 
sections. The method was used to solve Laplace's equation on a 
rectangle. A convergence rate of 1.3 fine grid work units per decade 
reduction in error was obtained. 

The algorithm uses a multigrid concept with the following
components.

(1) Incomplete Crout reduction is used to smooth the errors. 

(2) A dynamic relaxation parameter is used. 

(3) Coarse grid systems are obtained by constraining the corrections 
at the fine grid level. These constraints are in the form of 
simple interpolation. 

The method has some drawbacks. The system of equations needs to be
stored. Recalculation of the fine grid at each level would increase the 
computational effort by a factor approximately proportional to the number 
of levels used. For nonlinear problems, updating the nonlinear parts can 
only be accomplished at the fine grid level. Another drawback occurs 
with the simple forms typical of finite difference methods. For example, 
a 2-D finite difference method usually uses a 5 point star rather than 9 
points. The interpolation used in this paper will not maintain this 5 
diagonal system, but expands to a 9 diagonal system. 
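The expansion from five to nine diagonals can be observed on a small example. The sketch below assumes a 9 x 9 grid with the standard five-point difference operator and linear interpolation at every other point in each direction; `interp_1d` and the other names are illustrative.

```python
import numpy as np

def interp_1d(nc):
    """Linear interpolation from nc coarse points to 2*nc - 1 fine points."""
    C = np.zeros((2 * nc - 1, nc))
    for j in range(nc):
        C[2 * j, j] = 1.0
    for j in range(nc - 1):
        C[2 * j + 1, j] = 0.5
        C[2 * j + 1, j + 1] = 0.5
    return C

n = 9
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
I = np.eye(n)
A = np.kron(I, T) + np.kron(T, I)        # five-point star on a 9 x 9 grid

P = interp_1d(5)
C = np.kron(P, P)                         # 81 x 25 interpolation
A1 = C.T @ A @ C                          # compressed 5 x 5 grid system

fine_center = 4 * n + 4                   # center node of the 9 x 9 grid
coarse_center = 2 * 5 + 2                 # center node of the 5 x 5 grid

nnz_fine = int(np.count_nonzero(np.abs(A[fine_center]) > 1e-12))
nnz_coarse = int(np.count_nonzero(np.abs(A1[coarse_center]) > 1e-12))
print(nnz_fine, nnz_coarse)
```

The fine grid row couples a node to its four edge neighbors only, while the compressed row also couples it to the four corner neighbors.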

The main advantage of the method is the influence of the 
interpolation formulas. The coarse grid systems contain not only the 
"average residuals", but fine grid geometry information and the implied 
interpolation of the solution back to the fine grid. It is expected that 
this unification between the multigrid phases will prove advantageous 
when general distorted geometries are used. The method is easy to use 
and does not require guesswork for determining parameters. Simple 
interpolation forms are used, producing efficient iterations. 

It is the opinion of the author that the advantages will outweigh the
disadvantages. A 3-D full potential program is being developed using the
method in this paper.


10. NOMENCLATURE

$A$                  Coefficient matrix for a linear system of equations
$A_i$                Coefficient matrix at the $i$-th multigrid level
$B$                  Interpolation coefficient matrix
$B_i$                Interpolation coefficient matrix from level $i$ to
                     level $(i-1)$
$C$                  Augmented interpolation coefficient matrix
$C_i$                Augmented interpolation coefficient matrix from
                     level $i$ to level $(i-1)$
$C^*$                Matrix defining linear constraints implied by $C$
$D_i$                Interpolation coefficient matrix from level $i$ to
                     fine grid
$F$                  Vector defined by $\partial L/\partial \phi$
$k$                  Vector of unknowns in constrained correction
                     formulation
$L$                  Variational form
$L^*$                Alternate variational form
$m$                  Maximum number of levels used ($m = 0$ means fine
                     grid only)
$n_1, n_2, n_3$      Number of nodes in a given direction of the grid
$r$                  Residual vector
$r_i$                Residual vector at the $i$-th level
$\delta$             Correction vector
$\delta_a$           Correction vector approximation
$\delta_b$           A basis vector used for interpolation
$\delta_i$           Basis vector at the $i$-th level
$\phi$               Solution vector
$\phi_i$             $i$-th iteration for the solution vector
$(\ )^T$             Transpose operator
$\partial L/\partial \phi$
                     A vector composed of derivatives of $L$ with respect
                     to elements of $\phi$.
$\partial F/\partial \phi$
                     Matrix composed of derivatives of the elements of
                     $F$ with respect to elements of $\phi$. The $F$
                     components determine rows and the $\phi$ components
                     determine columns.
$\partial^2 L/\partial \phi^2$
                     Alternate notation for $\partial F/\partial \phi$.
                     The matrix composed of second derivatives of $L$
                     with respect to elements of $\phi$.

Figure 3.- 2-D Interpolation in Split Form: (a) Step #1, (b) Step #2.
(Solid dots: basis points; open circles: interpolated points.)